Harvard caches

nop head
I am moving some code from an MCF537x ColdFire, which had a unified cache, to an MCF547x, which has separate instruction and data caches.

My code has lots of tables declared as const so that they go into the read-only text section in flash, rather than the initialised data section in RAM. Does this mean that all my table accesses will miss the cache because they will be in the range for the instruction cache, but are not instruction fetches?

Similarly I have a few instructions in my bss section, mainly jump instructions to redirect interrupts to the relevant device driver. Am I right in thinking that these will all be cache misses as well because they will be in the range of the data cache but are actually instruction fetches?

Is there any way I can set up the caches to emulate a unified cache or do I have to rewrite all my code?

Another thing I don't understand is the two 4K SRAMs. The manual says I need to specify whether they are connected to the instruction bus or the data bus, but it also gives RAMBAR address space settings for both code and data. Can I mix code and data in these RAMs, and if so, which bus do I specify and how does it work if they are on the wrong bus for the access?

TIA, Chris
[hidden email] Send a post to the list. [hidden email] Join the list. [hidden email] Join the list in digest mode. [hidden email] Leave the list.

Re: Harvard caches

Nicolas Pinault
Hi,

Please, see my answer below.

> I am moving some code from an MCF537x ColdFire, which had a unified
> cache, to an MCF547x, which has separate instruction and data caches.
>
> My code has lots of tables declared as const so that they go into the
> read only text section in flash, rather than the initialised data
> section in RAM. Does this mean that all my table accesses will miss
> the cache because they will be in the range for the instruction cache,
> but are not instruction fetches?
>
> Similarly I have a few instructions in my bss section, mainly jump
> instructions to redirect interrupts to the relevant device driver. Am I
> right in thinking that these will all be cache misses as well because
> they will be in the range of the data cache but are actually
> instruction fetches?
>
> Is there any way I can set up the caches to emulate a unified cache or
> do I have to rewrite all my code?
On the MCF5407 there are four ACRx registers.
ACR0 and ACR1 are for the data cache.
ACR2 and ACR3 are for the instruction cache.

If you initialise ACR0 and ACR2 with the same value, the data and
instruction caches cover the same address range.

Here is the code I use to initialise the cache:
    // Invalidate the caches and leave them disabled
    SetMCF5407CACR (MCF5407_CACR_DCINVA);
    SetMCF5407CACR (MCF5407_CACR_BCINVA);
    SetMCF5407CACR (MCF5407_CACR_ICINVA);
    SetMCF5407CACR (MCF5407_CACR_DCINVA |
                    MCF5407_CACR_BCINVA |
                    MCF5407_CACR_ICINVA);
                    // | MCF5407_CACR_HSDIS

    // Set up the ACRs so that, once the cache is turned on, only SDRAM
    // and Flash are cached
    SetMCF5407ACR0 (MCF5407_ACR_BASE((INT32U)__SDRAM_START) |
//                  MCF5407_ACR_MASK(0x00FFFFFF) |      // 16 MB
//                  MCF5407_ACR_MASK(0x01FFFFFF) |      // 32 MB
                    MCF5407_ACR_MASK(0x03FFFFFF) |      // 64 MB
//                  MCF5407_ACR_MASK(0x07FFFFFF) |      // 128 MB
//                  MCF5407_ACR_MASK((INT32U)__HEAP_END - (INT32U)__SDRAM_START) |  // See size in link file
                    MCF5407_ACR_E          |
//                  MCF5407_ACR_CM(0)      |        // Write-through
                    MCF5407_ACR_CM(1)      |        // Copyback
                    MCF5407_ACR_S(2));

    SetMCF5407ACR1 (0);

    SetMCF5407ACR2 (MCF5407_ACR_BASE((INT32U)__FLASH_START) |
                    MCF5407_ACR_MASK(0x00) |
                    MCF5407_ACR_E          |
                    MCF5407_ACR_CM(0)      |
                    MCF5407_ACR_S(2));

    SetMCF5407ACR3 (0x00FFC060);            // See errata

    // Enable and configure the caches
    SetMCF5407CACR (
                    MCF5407_CACR_DEC      |
                    MCF5407_CACR_DESB     |
                    MCF5407_CACR_DDCM (2) |
                    MCF5407_CACR_BEC      |
//                  MCF5407_CACR_HSDIS    |
                    MCF5407_CACR_IEC      |
                    MCF5407_CACR_DNFB
//                  | MCF5407_CACR_IDCM
                    );

In my case, the data cache covers only the SDRAM space and the instruction
cache covers the full address space (ACR MASK is 0).
If you want the instruction and data caches to cover the same address range,
set ACR0 and ACR2 to the same value.
Note: SetMCF5407ACRx() and the other functions are custom-made and
very dependent on the compiler.
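
A minimal sketch of how an ACR base/mask pair selects an address range, which can be handy for checking a value on a host machine before writing it to a register. It assumes the usual ColdFire V4 ACR layout (address base in bits 31-24, address mask in bits 23-16, enable in bit 15, with set mask bits meaning "ignore this address bit") and it ignores the supervisor/user field. acr_make() and acr_matches() are hypothetical helpers, not part of the code above, so check the bit positions against the reference manual before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: base = bits 31-24, mask = bits 23-16, enable = bit 15. */
#define ACR_ENABLE 0x00008000u

/* Build an enabled ACR value for a power-of-two region of at least 16 MB.
 * Mode and supervisor fields are left at zero in this sketch. */
uint32_t acr_make(uint32_t base_addr, uint32_t size_bytes)
{
    assert(size_bytes >= (1u << 24));
    assert((size_bytes & (size_bytes - 1u)) == 0);
    uint32_t mask = ((size_bytes - 1u) >> 24) & 0xFFu;
    return (base_addr & 0xFF000000u) | (mask << 16) | ACR_ENABLE;
}

/* True if the top 8 address bits equal the base in every unmasked bit. */
bool acr_matches(uint32_t acr, uint32_t addr)
{
    if ((acr & ACR_ENABLE) == 0)
        return false;
    uint32_t base = acr >> 24;
    uint32_t mask = (acr >> 16) & 0xFFu;
    uint32_t top  = addr >> 24;
    return ((top ^ base) & ~mask & 0xFFu) == 0;
}
```

Under these assumptions a 64 MB region at address 0 needs a mask field of 0x03, and a mask field of 0xFF matches every address.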
>
> Another thing I don't understand is the two 4K SRAMs. The manual says
> I need to specify whether they are connected to the instruction bus
> or the data bus, but it also gives RAMBAR address space settings for
> both code and data. Can I mix code and data in these RAMs, and if so,
> which bus do I specify and how does it work if they are on the wrong
> bus for the access?
>
The MCF5407 has two internal RAM blocks of 4K each. There are two RAMBAR
registers (RAMBAR0 and RAMBAR1), one for each internal RAM block.
Each internal RAM block can be independently mapped anywhere in the address
space, subject to the RAMBARx alignment granularity.
An internal RAM block can be connected either to the instruction bus or to
the data bus, NOT both.
That is, an internal RAM block can be used either for code or for data,
NOT both.
If you configure an internal RAM block for data and you try to run code
from it, you will get an exception. The same applies to an internal RAM
block configured for instructions and accessed for data.


Hope this helps.

Regards,
Nicolas


Re: Harvard caches

David Brown-4
In reply to this post by nop head

nop head wrote:
> I am moving some code from an MCF537x ColdFire, which had a unified
> cache, to an MCF547x, which has separate instruction and data caches.
>
> My code has lots of tables declared as const so that they go into the
> read only text section in flash, rather than the initialised data
> section in RAM. Does this mean that all my table accesses will miss the
> cache because they will be in the range for the instruction cache, but
> are not instruction fetches?
>

Your table data will be read as data, and thus go in the data cache.
The same applies to any other constant data that is generated along with
the code (such as const values, strings, etc.).  Normally you would have
your instruction cache range covering either all memory (probably the
easiest), or all memory that can contain instructions.  Your data cache
range should cover all memory except perhaps peripheral areas.

> Similarly I have a few instructions in my bss section, mainly jump
> instructions to redirect interrupts to the relevant device driver. Am I
> right in thinking that these will all be cache misses as well because
> they will be in the range of the data cache but are actually instruction
> fetches?
>

Self-modifying code has been considered bad style for several decades.
I don't know your particular code or the problem you are trying to
solve, but as general advice you should try to use function pointers
rather than jump instructions in your table.
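
A minimal sketch of the function-pointer approach, assuming a common interrupt entry stub that knows the vector number. All names here (NUM_VECTORS, isr_install, isr_dispatch, uart_isr) are hypothetical and only illustrate the idea. The table is ordinary data, so installing a driver is a plain data write and no instructions are ever modified.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 8u              /* hypothetical table size */

typedef void (*isr_handler_t)(void);

static unsigned spurious_count;     /* interrupts with no driver installed */
static unsigned uart_irq_count;     /* stands in for real driver work */

static void default_handler(void) { spurious_count++; }
static void uart_isr(void)        { uart_irq_count++; }

/* Lives in RAM as data; only ever read and written as data. */
static isr_handler_t handler_table[NUM_VECTORS];

void isr_table_init(void)
{
    for (size_t i = 0; i < NUM_VECTORS; i++)
        handler_table[i] = default_handler;
}

/* Redirect a vector to a driver; a NULL handler restores the default. */
void isr_install(unsigned vector, isr_handler_t handler)
{
    assert(vector < NUM_VECTORS);
    handler_table[vector] = handler ? handler : default_handler;
}

/* Called from the common interrupt entry stub. */
void isr_dispatch(unsigned vector)
{
    assert(vector < NUM_VECTORS);
    handler_table[vector]();
}
```

The cost compared with a patched jump is one extra data load per interrupt, and there is nothing to keep coherent between the two caches.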

If you want to keep the jump instructions, your main problem is cache
synchronisation (ranges should be fixed as I said above).  When you
change one of these indirect jumps, that's a data write and the change
goes to the data cache.  The instruction cache knows nothing about the
change (they are not synchronised - synchronising costs a lot of logic
and latency).  You have to flush the relevant line of the data cache to
make sure the change is written out, then you must flush the matching
line in the instruction cache to make sure that you don't have old
values in the instruction cache.  As you can see, using pointers is much
easier.
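
The ordering described above can be sketched as follows. dcache_push_line and icache_invalidate_line are hypothetical hooks: on real hardware they would wrap the processor's cache push and invalidate operations, which are not shown here. The tracing stubs exist only so that the ordering can be checked on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks for the two cache maintenance operations. */
typedef struct {
    void (*dcache_push_line)(const volatile void *addr);
    void (*icache_invalidate_line)(const volatile void *addr);
} cache_ops_t;

/* Patch one instruction word that lives in RAM. */
void patch_code_word(volatile uint32_t *slot, uint32_t new_word,
                     const cache_ops_t *ops)
{
    assert(slot != NULL && ops != NULL);
    *slot = new_word;                   /* 1. data write, may stay in the data cache */
    ops->dcache_push_line(slot);        /* 2. force the new word out to memory */
    ops->icache_invalidate_line(slot);  /* 3. drop any stale copy of the old word */
}

/* Host-side stubs that record the order of the calls. */
static char trace[8];
static int  trace_len;

static void trace_push(const volatile void *addr)
{
    (void)addr;
    trace[trace_len++] = 'P';
}

static void trace_invalidate(const volatile void *addr)
{
    (void)addr;
    trace[trace_len++] = 'I';
}

static const cache_ops_t trace_ops = { trace_push, trace_invalidate };
```

The order matters: invalidating the instruction cache before the data cache has been pushed could let the stale word be fetched from memory again.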


Re: Harvard caches

nop head
In reply to this post by Nicolas Pinault
Hi Nicolas,
   Thanks very much for the detailed answer. I was worried that overlapping the caches would cause coherency problems, but thinking more about it, that would only be the case when code is being written to, and a push of the data cache followed by an invalidate of the instruction cache should fix that.

So if the SRAM can only contain either code or data but not both, then Table 7-2 in section 7.6 of the MCF547x reference manual makes no sense, as it says to set RAMBAR[5-0] to 0x21 if the SRAM contains both code and data.

Regards, Chris

2009/2/27 Nicolas Pinault <[hidden email]>

Re: Harvard caches

nop head
In reply to this post by David Brown-4
Hi David,
  Looks like our posts crossed. Yes, this self-modifying code is a couple of decades old and was done to reduce interrupt latency. Since the processor is now about 200 times faster, I could rewrite it to use pointers.

Regards, Chris


2009/2/27 David Brown <[hidden email]>

Re: Harvard caches

Nicolas Pinault
In reply to this post by nop head

> Hi Nicolas,
> Thanks very much for the detailed answer. I was worried that overlapping the caches would cause coherency problems, but thinking more about it, that would only be the case when code is being written to, and a push of the data cache followed by an invalidate of the instruction cache should fix that.
>
> So if the SRAM can only contain either code or data but not both, then Table 7-2 in section 7.6 of the MCF547x reference manual makes no sense, as it says to set RAMBAR[5-0] to 0x21 if the SRAM contains both code and data.
Yes and no. Bits [5-0] are the same for all chip-select configurations: CS0, CS1, FLASHBAR, RAMBAR... all use these bits.
If you configure RAMBAR0 for data access and you try to execute code from it, you will get an exception.
If you configure RAMBAR0 for instruction access while the block is connected to the data bus, you will get another type of error. I'm not sure, but I think it will hang the system because TA is never asserted.
Configuring RAMBARx for both data and instruction access can be useful if you want to switch the RAM from code to data and vice versa.
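
A rough sketch of how the low six bits could be decoded. It assumes the conventional ColdFire arrangement of one valid bit plus five address-space mask bits, where a set mask bit blocks that kind of access. The bit names and positions below are assumptions and should be checked against Table 7-2; rambar_allows() is a hypothetical helper. Under this reading, 0x21 masks only CPU-space and interrupt-acknowledge cycles and leaves both code and data accesses enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed meaning of RAMBAR[5:0]; verify against the reference manual. */
#define RAMBAR_V        0x01u  /* mapping is valid */
#define RAMBAR_MASK_UD  0x02u  /* block user data accesses */
#define RAMBAR_MASK_UC  0x04u  /* block user code accesses */
#define RAMBAR_MASK_SD  0x08u  /* block supervisor data accesses */
#define RAMBAR_MASK_SC  0x10u  /* block supervisor code accesses */
#define RAMBAR_MASK_CI  0x20u  /* block CPU space / interrupt acknowledge */

/* True when the mapping is valid and the given access type is not masked. */
bool rambar_allows(uint32_t rambar, uint32_t space_mask_bit)
{
    assert(space_mask_bit >= RAMBAR_MASK_UD);
    assert(space_mask_bit <= RAMBAR_MASK_CI);
    return (rambar & RAMBAR_V) != 0 && (rambar & space_mask_bit) == 0;
}
```

This only describes which access types the address decode accepts; which bus the block is attached to is a separate setting, as discussed above.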

Nicolas

