gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

C/C++ > Need help rearranging an array

#169683 - DiscoStew - Wed Jul 29, 2009 6:30 pm

So I have this array that I want to reorganize. It currently looks like this (the 2D representation, then the 1D representation)...

Code:
 ___________    ___________________________________________________________
| 1 | 6 | B |  | 1 | 6 | B | 2 | 7 | C | 3 | 8 | D | 4 | 9 | E | 5 | A | F |
| 2 | 7 | C |   -----------------------------------------------------------
| 3 | 8 | D |
| 4 | 9 | E |
| 5 | A | F |
 -----------


...and what I want to do is rearrange it to look like this...

Code:
 ___________    ___________________________________________________________
| 1 | 2 | 3 |  | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
| 4 | 5 | 6 |   -----------------------------------------------------------
| 7 | 8 | 9 |
| A | B | C |
| D | E | F |
 -----------


The array can be of different sizes (and the number of rows and columns are known for each). While it seems like something simple with the use of allocating a new array and copying to there (and de-allocating the old), I'm using arrays far larger that the example here, and am trying not to go about with that method (mainly because I'm trying to prevent excessive gaps in the heap of Main RAM).

Would anyone have any info pertaining to this type of restructuring? Thx in advance.
_________________
DS - It's all about DiscoStew

#169684 - Dwedit - Wed Jul 29, 2009 8:01 pm

Look for "in place matrix transpose", there are algorithms for that.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

#169692 - Pete_Lockwood - Thu Jul 30, 2009 1:48 pm

Did you find something for this?

Out of interest, is this transformation something you need to do many times at high speed for a large number of equally dimensioned objects? Or few times? Or small number of objects? Or lots of differently sized objects? Or speed isn't of the essence?

It's trivial to auto-generate a LUT of a size slightly below w*h/2 for a rectangular array or w*h/4 for a square, which would give you a very fast way to achieve the translation but wouldn't be so practical if you have lots of objects that are different shapes.

It also feels intuitively as though there should be a method of doing this that would use probably at most 'w' temporary elements, but not knowing your underlying requirements I didn't look further into it.

I know the problem has been solved by others but it's kinda interesting to think about..
_________________
It's not an illusion, it just looks like one.

#169693 - sgeos - Thu Jul 30, 2009 3:25 pm

Pete_Lockwood wrote:
I know the problem has been solved by others but it's kinda interesting to think about..

But, as you mentioned, we don't know what the problem is exactly, so the best answer can not be given.

You need to sort your array. How depends on exactly what you are doing.

#169695 - Pete_Lockwood - Thu Jul 30, 2009 4:31 pm

sgeos wrote:
But, as you mentioned, we don't know what the problem is exactly, so the best answer can not be given.


Hence my questions :)
_________________
It's not an illusion, it just looks like one.

#169696 - DiscoStew - Thu Jul 30, 2009 4:50 pm

What I wanted to do was what Dwedit suggested, which is in place matrix transposing, or in other words, swapping columns to rows in a rectangular matrix. Unfortunately, when I searched for something like that online, I mostly got examples that involved copying to another array.

The size of my arrays have been known to range from a few kBytes to upwards of almost 600kBytes. It's all part of data loading, so speed isn't necessarily an issue.
_________________
DS - It's all about DiscoStew

#169697 - Miked0801 - Thu Jul 30, 2009 5:18 pm

Wow, that is an interesting problem. My first attempt (fail) is here

Code:

    int matrix[5][3] = {{0x1,0x6,0xB},{0x2,0x7,0xC},{0x3,0x8,0xD},{0x4,0x9,0xE},{0x5,0xA,0xF}};

    int (*matrixDest)[3][5];
    (int *)matrixDest = (int *)(&matrix);

    for(int i=0; i<5; i++)
    {
        for(int j=0; j<3; j++)
            {
                int temp = (*matrixDest)[j][i];
                (*matrixDest)[j][i] = matrix[i][j];
                matrix[i][j] = temp;
            }
    }


Works until I double assign something after a swap. Hmm. Thinking...

#169698 - Miked0801 - Thu Jul 30, 2009 5:37 pm

A huge LUT would make it easier, but with 600k arrays, that becomes a problem on GBA/DS. But if it's done on PC. Hmmm.

Ok, I've got a solution, but it needs a little RAM. Can you afford 1 bit per array element of temp memory? It may need a lot less, but my second idea should work. Give me a few to code and test it...

#169699 - Cearn - Thu Jul 30, 2009 6:41 pm

I think I may have a solution that needs one row of temp space and works in W*W*H time, but I'm really thinking out loud here.

Example with Width W=5, Height H=3:

Code:
0 1 2
3 4 5     0 3 6 9 C
6 7 8  -> 1 4 7 A D
9 A B     2 5 8 B E
C D E

That is:

From
    0 1 2 3 4 5 6 7 8 9 A B C D E
To
    0 3 6 9 C 1 4 7 A D 2 5 8 B E


Algorithm:

0 1 2 3 4 5 6 7 8 9 A B C D E          Start.
0 3 6 9 C                              Gather Col-0 in temp array (offset=3) ; H time
          1 2 4 5 7 8 A B D E          Compress other elements ; W*(H-1) time
0 3 6 9 C 1 2 4 5 7 8 A B D E          Insert temp row. Col-0 done.
          1 4 7 A D                    Col-1 to temp (offset=2) ; H time
                    2 5 8 B E          Compress other elements ; (W-1)*(H-2) time
          1 4 7 A D 2 5 8 B E          Insert temp. Col-1 done. Col-2 as well because
                                       it's the last one.

It's still kinda wasteful, but it should be fairly easy to implement.

Are you sure the transposition is necessary though? You can basically swap the order of matrix-traversal by swapping x and y coordinates. Cache might not like it, but it's an easy solution. Or you could pre-transpose everything before the NDS ever gets its hands on it.

What are the number of elements we're talking about here? (width and height, not just total matrix size)

#169700 - elhobbs - Thu Jul 30, 2009 7:01 pm

what about iterating over all of the elements in the array in sequential order from 0 to widthxlength and calculating the index values in the original array and the transposed array? if the indexes are different then swap

if index is the sequential item from 0 to total count (in this example 15)
if the original array is 3 columns by 5 rows then

original
index0 = index/3
index1 = index % 3

if the destination array is 5 x 3
new
index0 = index%3
index1 = index/3

#169701 - Karatorian - Thu Jul 30, 2009 7:25 pm

Ok, maybe I'm crazy, but isn't this the sort of thing you should be preproccessing at build time? Or is that why you're asking?

Or do you need the array both ways? If you need it both ways, it seems like it'd be better to use some method to share the storage space and "virtualize" the indexing for the lesser used case.

#169702 - Miked0801 - Thu Jul 30, 2009 7:40 pm

Elhobbs's solution is very close to mine. The problem is double swapping indexes. To solve it, you take a start location and put it in a temp. You go to it's final destination and swap with the temp. you then go to the new value's final location and swap again. do while you haven't visited that location (hence the 1-bit visited values). Start walking up the array until you find a value not swapped and repeat until complete. For the 5x3 example, 0,0 swaps with 0,0 and you're done as you've already swapped the target. The next of 1,0 swaps with 0,1 -> 1,2 ->3,2 ->3,0 ->1,0 which is already swapped so you are done. 0,2 also swaps out 6 values which just leaves 2,1 and 4,2 - both which swap themselves.

I've run out of time to debug my code solution, but the above algorithm is my idea. It only takes 1 element of storage, a few counters, and the bool / bit array to know where you've visited. It could also be there there is a deeper pattern with visiting such that you always know that there are either no swaps or (Major Axis size) + 1 swaps needed to move stuff - which would make storage trivial. I'm not sure if that pattern holds outside of 5x3 and 4x4 though.

#169704 - Miked0801 - Thu Jul 30, 2009 7:45 pm

Just realized that were basically talking about a sort here (I'm slow I know.) As such, sorting via the 'new' index order should be a matter of calling any canned sort and going from there. My sort runs o(N) time (wee!) and requires N swaps. Pretty good. Any sort would do though if you didn't caree about run time much.

#169705 - Pete_Lockwood - Thu Jul 30, 2009 7:59 pm

To sort by the index, don't you need w*h extra storage elements to hold the index that you'll sort (which I thought was outside the scope of the request) - and you can find better ways to do it because your array isn't random unsorted data.


Separate thought: looks rather like any translation you make at the beginning of the array can also be made, mirrored, at the end of the array.
_________________
It's not an illusion, it just looks like one.

#169706 - Miked0801 - Thu Jul 30, 2009 8:40 pm

There are many in-place sorts. Shell sort comes to mind for efficiency. A properly coded QSort will also not eat too much space up. But because we have good preknowledge, we can get away with linear time and no big overhead.

#169708 - elhobbs - Thu Jul 30, 2009 9:55 pm

Miked0801 wrote:
Just realized that were basically talking about a sort here (I'm slow I know.) As such, sorting via the 'new' index order should be a matter of calling any canned sort and going from there. My sort runs o(N) time (wee!) and requires N swaps. Pretty good. Any sort would do though if you didn't caree about run time much.
removed to avoid embarassing myself

#169711 - Miked0801 - Thu Jul 30, 2009 11:42 pm

Bah - I put my first, very failed attempt up for fun. No harm in showing ideas :)

#169712 - keldon - Fri Jul 31, 2009 12:09 am

This is a little tricky if you don't have any memory to work with because of the whole "double assign something after a swap" issue. So what I have done is made the code work out where that swap has been moved to.

I have not compiled or tested this, it's more of me writing code to communicate my thinking of it. Although do bear in mind it's way past my bed-time so I'm just going to hope an explanation fairy will pop up and explain the in-between bits :D

Edit: I can confirm that this compiles, but does not work; result!
Edit: I have fixed it, and IT WORKS! ... please note that I do not typically code like this, that struct was ineffective and added nothing to the code.

Code:
typedef unsigned char u8;
typedef unsigned long u32;

// ----
// Function Declarations
// ----

template<class TYPE>
void swap( TYPE &a, TYPE &b );

template<class TYPE>
void swap( TYPE *array, size_t width, u32 source_x, u32 source_y, u32 dest_x, u32 dest_y );

u32 calculate_source_idx_from_dest_idx( size_t width, size_t height, u32 dest_idx );

void calculate_source_position( size_t width, size_t height, u32 dest_x, u32 dest_y, u32 &source_x, u32 &source_y );

template<class TYPE>
void reorganize_array( TYPE *array, size_t width, size_t height );



// ----
// Methods
// ----


template<class TYPE>
void swap( TYPE &a, TYPE &b )
{
   TYPE tmp = a;
   a = b;
   b = tmp;
} // swap

// ----
// Swap array elements
// ----
template<class TYPE>
void swap( TYPE *array, size_t width, u32 source_x, u32 source_y, u32 dest_x, u32 dest_y )
{
   swap( array[ source_x + ( source_y * width ) ], array[ dest_x + ( dest_y * width ) ] );
} // swap


// ----
// This method reorganizes an array according to the problem as defined in the
// following URL (http://forum.gbadev.org/viewtopic.php?p=169683#169683)
// ----
template<class TYPE>
void reorganize_array( TYPE *array, size_t width, size_t height )
{
   for( u32 dest_y = 0; dest_y < static_cast<u32>(height); ++dest_y )
   { 
      for( u32 dest_x = 0; dest_x < static_cast<u32>(width); ++dest_x )
      {
         u32 source_x, source_y;
         calculate_source_position( width, height, dest_x, dest_y, source_x, source_y );
         swap( array, width, source_x, source_y, dest_x, dest_y );
      }
   }
} // reorganize_array


// ----
// Calculates the source position based on the destination. This works by
// identifying the source array index that corresponds to a destination index.
// However if the source index is found to be less than the destination index
// then the algorithm knows that that element has been swapped, and then
// searches for where that index was swapped from.
// ----
void calculate_source_position( size_t width, size_t height, u32 dest_x, u32 dest_y, u32 &source_x, u32 &source_y )
{
   u32 dest_idx = ( dest_x + dest_y * static_cast<u32>( width ) );

   u32 source_idx = dest_idx;

   do
   {
      source_idx = calculate_source_idx_from_dest_idx( width, height, source_idx );
   }
   while( dest_idx > source_idx );

   source_x = source_idx % width;
   source_y = source_idx / width;

} // calculate_source_position


u32 calculate_source_idx_from_dest_idx( size_t width, size_t height, u32 dest_idx )
{
   u32 source_x = dest_idx / static_cast<u32>( height );
   u32 source_y = dest_idx % static_cast<u32>( height );

   return source_x + ( source_y * static_cast<u32>( width ) );
} // calculate_source_idx_from_dest_idx


typedef struct ARRAY_FORMAT
{
   size_t width;
   size_t height;
}
ARRAY_FORMAT;


template<class T>
void fill_source_array( T *array, const ARRAY_FORMAT &array_format )
{
   if( array == NULL ) return;

   for( u32 x = 0; x < static_cast<u32>( array_format.width ); ++x )
   {
      for( u32 y = 0; y < static_cast<u32>( array_format.height); ++y )
      {
         array[ x + (y * array_format.width) ] = static_cast<T>(y + (x * static_cast<u32>(array_format.height)) );
      }
   }
} // fill_source_array


template<class T>
void print_array( T *arr, const ARRAY_FORMAT &array_format )
{
   printf( ">>> Printing array\n" );
   
   for( int i = 0; i < static_cast<int>( array_format.width * array_format.height); ++i )
   {
      printf( "%d, ", arr[i] );
   }
   
   printf( "\n<<<\n\n" );
} // print_array


void test_array_transpose( size_t width, size_t height )
{
   if( (width * height) == 0 ) return;

   ARRAY_FORMAT array_format = { width, height };
   
   int *t_array = new int[ width * height ];
   
   fill_source_array( t_array, array_format );
   print_array( t_array, array_format );
   
   reorganize_array( t_array, array_format.width, array_format.height );
   print_array( t_array, array_format );
   
   delete []t_array;
   
   return;
   
} // test_array_transpose



int _tmain(int argc, _TCHAR* argv[])
{
   test_array_transpose( 32, 44 );
   (void) getchar();
}

#169718 - keldon - Fri Jul 31, 2009 10:03 am

Miked0801 wrote:
...
I've run out of time to debug my code solution, but the above algorithm is my idea. It only takes 1 element of storage, a few counters, and the bool / bit array to know where you've visited. It could also be there there is a deeper pattern with visiting such that you always know that there are either no swaps or (Major Axis size) + 1 swaps needed to move stuff - which would make storage trivial. I'm not sure if that pattern holds outside of 5x3 and 4x4 though ...


That would not work because numbers can be swapped more than once. Even with the example given the source 6 is swapped for the two, and then again when placing the fourth as it is in the way. You could add a check into my code to count this.

#169719 - Dwedit - Fri Jul 31, 2009 2:05 pm

People actually use Size_T and Static Cast? o_O
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

#169730 - keldon - Fri Jul 31, 2009 8:49 pm

Dwedit wrote:
People actually use Size_T and Static Cast? o_O


Don't act like you've never done (or been a witness to) anything like that before. Besides, it's an acquired taste <_<

#169784 - Miked0801 - Mon Aug 03, 2009 5:56 pm

keldon wrote:
Miked0801 wrote:
...
I've run out of time to debug my code solution, but the above algorithm is my idea. It only takes 1 element of storage, a few counters, and the bool / bit array to know where you've visited. It could also be there there is a deeper pattern with visiting such that you always know that there are either no swaps or (Major Axis size) + 1 swaps needed to move stuff - which would make storage trivial. I'm not sure if that pattern holds outside of 5x3 and 4x4 though ...


That would not work because numbers can be swapped more than once. Even with the example given the source 6 is swapped for the two, and then again when placing the fourth as it is in the way. You could add a check into my code to count this.


My bool array was there to prevent this - read teh array before seapping. If true, go to next number.

This is a very fun problem :)

#169807 - keldon - Tue Aug 04, 2009 1:55 pm

It has just occurred to me that this algorithm efficiently transforms between interleaved and non-interleaved data streams. Interesting!

#169808 - sgeos - Tue Aug 04, 2009 3:03 pm

keldon wrote:
It has just occurred to me that this algorithm efficiently transforms between interleaved and non-interleaved data streams. Interesting!

This was the original goal. Half of solving a problem is properly defining it.

#169832 - DiscoStew - Wed Aug 05, 2009 7:34 pm

Thanks everyone for your help with this problem. I didn't want to create an entirely new array to copy to because of memory fragmentation, and I didn't want to restructure the entire thing externally because I wanted to keep the data in it's purest form.
_________________
DS - It's all about DiscoStew

#169834 - keldon - Wed Aug 05, 2009 8:55 pm

Hi DS, how was my implementation for you? It fits your requirements in regards to memory usage (i.e. it does not allocate new memory, or copy to a new destination).

Benchmarks for 600k (744 * 744)
- 600k swaps
- 900k indirections

I have also written reorganize_array() as a template method. I'm sure this can all be optimized heavily, for example if you know your sizes before hand then that can be a template parameter - allowing the compiler to optimize the divides or rewriting the code to use reciprocals, amongst other obvious optimizations.

#169839 - DiscoStew - Thu Aug 06, 2009 8:23 pm

Works pretty well, though I was kinda worried about the number of divides used. It's all loading time anyways. One thing I did forget to mention is that while the biggest array I've had was around 600Kbytes, the element size most of the time is 128 bytes, which makes for some good news in that there will be far less swapping of sections to get it all organized.

However, that is also the bad news, and I've only recently noticed this after knowing more about the format I'm working with and fixing the bugs involved with it. It appears that most of the data I've been working with have a length that is divisible by 128, while the rest are not, which in turn screws up when trying to sort those arrays.

That's not to say that the algorithm can't be used. It just means that I can only use it on those arrays that have a length divisible by 128 (which is at least 90% of the files I'm using), and the rest will have to be sorted with one of the other non-efficient method, like copying to another array.
_________________
DS - It's all about DiscoStew