Pelles C forum

General => Chit-Chat => Topic started by: czerny on July 03, 2014, 01:29:15 pm

Title: frequency of ints in huge array
Post by: czerny on July 03, 2014, 01:29:15 pm
I have a little problem. May be there is an intelligent  solution I don't know?

I have a huge (typically 20000000 elements) array of unsigned long ints and I have to find that int with the highest frequency.

My idea so far is to use some sort of binary tree with something like

Code: [Select]
struct node {
  unsigned int x;
  unsigned int freq;
  struct node *left;
  struct node *right;
};

as node. But thisway I have 5 times as much memory to allocate.
Is there a cheaper way to do this?

Title: Re: frequency of ints in huge array
Post by: jj2007 on July 03, 2014, 02:41:49 pm
Have you tried sorting the array? On my trusty AMD, that takes three seconds. Afterwards, you just go through the array and check how often a value is being repeated, and if it's repeat count is higher than the previous champion, you take that value.
Title: Re: frequency of ints in huge array
Post by: Bitbeisser on July 03, 2014, 07:40:25 pm
Have you tried sorting the array? On my trusty AMD, that takes three seconds. Afterwards, you just go through the array and check how often a value is being repeated, and if it's repeat count is higher than the previous champion, you take that value.
Well, that would only work if the original order of the array doesn't matter, a fact that czerny unfortunately did not elude to if that is a restriction or not.
Also he did not mention if time is an issue here either, otherwise I would suggest just start counting from the top down, which could be reasonable fast depending on the distribution of values within the array.
Might also want to look into a modification of a binary search over the array...

Ralf
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 04, 2014, 12:36:23 am
I would suggest just start counting from the top down

That sounds straightforward, can you post an example?
Title: Re: frequency of ints in huge array
Post by: czerny on July 04, 2014, 09:50:37 am
Have you tried sorting the array? On my trusty AMD, that takes three seconds. Afterwards, you just go through the array and check how often a value is being repeated, and if it's repeat count is higher than the previous champion, you take that value.
I do not want to change the original array. So I would have to make a copy. :(
Title: Re: frequency of ints in huge array
Post by: czerny on July 04, 2014, 09:59:06 am
I would suggest just start counting from the top down, ...
Do you mean something like that?
Code: [Select]
maxcount = 0;
for (i=0; i<MAX; i++) {
  count = 0;
  for (k=i; k<MAX; k++)
    if (a[k] == a[i]) count++;
  if (count > maxcount) {
    maxcount = count;
    index = i;
  }
}
This is very slow for huge arrays! O(n²)
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 04, 2014, 11:03:16 am
This is very slow for huge arrays! O(n²)

Indeed. If you have enough RAM, sorting a copy is a lot faster. Quicksort does it at O(n log n), a radix sort can even be faster, depending on the data. Finding the most frequent element after the sort takes a negligible time, see below.

AMD Athlon(tm) Dual Core Processor 4450B (MMX, SSE, SSE2, SSE3)
2596 ms for sorting
77 ms for finding
Title: Re: frequency of ints in huge array
Post by: czerny on July 04, 2014, 11:25:58 am
Indeed. If you have enough RAM, sorting a copy is a lot faster. Quicksort does it at O(n log n), a radix sort can even be faster, depending on the data. Finding the most frequent element after the sort takes a negligible time, see below.

AMD Athlon(tm) Dual Core Processor 4450B (MMX, SSE, SSE2, SSE3)
2596 ms for sorting
77 ms for finding

Plus the time to copy the array (which is small) and 2 * memory.
But by all means better than the tree.
Title: Re: frequency of ints in huge array
Post by: czerny on July 04, 2014, 11:45:52 am
I would suggest just start counting from the top down, ...
Do you mean something like that?
Code: [Select]
maxcount = 0;
for (i=0; i<MAX; i++) {
  count = 0;
  for (k=i; k<MAX; k++)
    if (a[k] == a[i]) count++;
  if (count > maxcount) {
    maxcount = count;
    index = i;
  }
}
This is very slow for huge arrays! O(n²)
May be, that this can be optimized. The search can be terminated if maxcount > (MAX-i). But this depends on the data.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 04, 2014, 11:55:32 am
czerny, see if this is ok, it takes 92 msecs here.

Code: [Select]
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

long long __cdecl StartTimer(void)
{
long long t1;
QueryPerformanceCounter((LARGE_INTEGER*)&t1);
return t1;
}

long long __cdecl StopTimer(long long t1)
{
long long t2;
long long frequency;
QueryPerformanceCounter((LARGE_INTEGER *) & t2);
QueryPerformanceFrequency((LARGE_INTEGER *) & frequency);
long long elapsedTime = (t2 - t1) * 1000 / frequency;
return (long long)elapsedTime;
}

int __cdecl main(void)
{
#define SIZE 20000000
unsigned long int i;
unsigned long int * arry;

arry = malloc(SIZE * sizeof(unsigned long int));
if(arry == NULL){
printf("Error, no mem!");
return 0;
}

for (i = 0; i<SIZE; i++)
{
arry[i] = rand()%10000;
}

long long t = StartTimer();
unsigned long int maximum;
int freq = 1;
maximum = arry[0];

for (i = 0; i < SIZE; i++)
{
if (arry[i] > maximum){
maximum = arry[i];
freq = 1;
}else if(arry[i] == maximum){
freq++;
}
}

printf("msecs %lld\n", StopTimer(t));
printf("maximum %d has occurences: %d times\n", maximum, freq);
free(arry);
return 0;
}

EDIT: freq = 1;

John
Title: Re: frequency of ints in huge array
Post by: czerny on July 04, 2014, 01:17:12 pm
czerny, see if this is ok, it takes 92 msecs here.

Code: [Select]
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

long long __cdecl StartTimer(void)
{
long long t1;
QueryPerformanceCounter((LARGE_INTEGER*)&t1);
return t1;
}

long long __cdecl StopTimer(long long t1)
{
long long t2;
long long frequency;
QueryPerformanceCounter((LARGE_INTEGER *) & t2);
QueryPerformanceFrequency((LARGE_INTEGER *) & frequency);
long long elapsedTime = (t2 - t1) * 1000 / frequency;
return (long long)elapsedTime;
}

int __cdecl main(void)
{
#define SIZE 20000000
unsigned long int i;
unsigned long int * arry;

arry = malloc(SIZE * sizeof(unsigned long int));
if(arry == NULL){
printf("Error, no mem!");
return 0;
}

for (i = 0; i<SIZE; i++)
{
arry[i] = rand()%10000;
}

long long t = StartTimer();
unsigned long int maximum;
int freq = 1;
maximum = arry[0];

for (i = 0; i < SIZE; i++)
{
if (arry[i] > maximum){
maximum = arry[i];
freq = 1;
}else if(arry[i] == maximum){
freq++;
}
}

printf("msecs %lld\n", StopTimer(t));
printf("maximum %d has occurences: %d times\n", maximum, freq);
free(arry);
return 0;
}

EDIT: freq = 1;

John
Try this with: SIZE=3; arry = {5,3,3];

I want the result: maximum == 3; freq == 2;
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 04, 2014, 01:28:58 pm
Plus the time to copy the array (which is small) and 2 * memory.

199 ms for copying
5800 ms for sorting
127 ms for finding


"copying" includes allocating and freeing the copy. The biggest chunk is clearly the sorting. What kind of data do you have? Does it spread over the whole 32-bit range?
Title: Re: frequency of ints in huge array
Post by: czerny on July 04, 2014, 03:01:48 pm
Plus the time to copy the array (which is small) and 2 * memory.

199 ms for copying
5800 ms for sorting
127 ms for finding


"copying" includes allocating and freeing the copy. The biggest chunk is clearly the sorting. What kind of data do you have? Does it spread over the whole 32-bit range?
Yes, there is no known subset.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 04, 2014, 04:08:17 pm
czerny, sorry about my post, I should read more carefully.

This code takes just over 4 secs here. I found the main body of the code on the web.

Code: [Select]
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

long long __cdecl StartTimer(void)
{
long long t1;
QueryPerformanceCounter((LARGE_INTEGER*)&t1);
return t1;
}

long long __cdecl StopTimer(long long t1)
{
long long t2;
long long frequency;
QueryPerformanceCounter((LARGE_INTEGER *) & t2);
QueryPerformanceFrequency((LARGE_INTEGER *) & frequency);
long long elapsedTime = (t2 - t1) * 1000 / frequency;
return elapsedTime;
}

int __cdecl CompareInt(const void * param1, const void * param2)
{
unsigned long int * pInt1 = (unsigned long int *)param1;
unsigned long int * pInt2 = (unsigned long int *)param2;

if(*pInt1 < *pInt2)
return 1;
else if(*pInt1 > *pInt2)
return -1;

return 0;
}

int __cdecl main(void)
{
#define SIZE 20000000
unsigned long int i;
unsigned long int * arry, * arry1;

arry = malloc(SIZE * sizeof(unsigned long int));
if(arry == NULL){
printf("Error, no mem!");
return 0;
}
arry1 = malloc(SIZE * sizeof(unsigned long int));
if(arry1 == NULL){
printf("Error, no mem!");
return 0;
}

for (i = 0; i<SIZE; i++)
{
arry[i] = rand()%10000;
}


long long t = StartTimer();

memcpy(arry1, arry, SIZE*sizeof(unsigned long int));

qsort(arry, SIZE, sizeof(unsigned long int), CompareInt);

    int previous = arry[0];
    int popular = arry[0];
    int count = 1;
    int maxCount = 1;

    for (int i = 1; i < SIZE; i++) {
        if (arry[i] == previous)
            count++;
        else {
            if (count > maxCount) {
                popular = arry[i-1];
                maxCount = count;
            }
            previous = arry[i];
            count = 1;
        }
    }
printf("msecs %lld\n", StopTimer(t));
printf("maximum %d has occured %d times\n", popular, maxCount);

free(arry);
free(arry1);
return 0;
}

If it's no good I'll stop. :)

John
Title: Re: frequency of ints in huge array
Post by: frankie on July 04, 2014, 05:27:36 pm
I would propose a different approach that is very efficient if the distribution is narrow (we have many ripetitions) in memory usage and time because we scan the main array only one time and the 'reduced' array many times.
If all values are expected to be gaussian this algorithm is a nightmare like the others  ;D

Code: [Select]
#define _DEBUG
#ifdef _DEBUG
#define PDEBUG(...) printf(__VA_ARGS__)
#else
#define PDEBUG(...)
#endif
#define ARRAY_SIZE 20000000
#define STEP    (sizeof(array)/(100*sizeof(DWORD))*sizeof(EVAL))

typedef struct
{
DWORD val;
int Count;
} EVAL, *LPEVAL;

DWORD array[ARRAY_SIZE];

BOOL ArrayMode(int a[], int arraySize, DWORD *Val, int *freq)
{
DWORD CurrSize = STEP; //Start with a small percentage
DWORD Idx      = 0; //Current free element
LPEVAL pE = malloc(CurrSize);
if (!pE)
return FALSE; //Memory fail

for (int i=0; i<sizeof(array)/sizeof(DWORD); i++)
{
DWORD j;
for (j=0; j<Idx; j++)
if (pE[j].val==a[i])
{
pE[j].Count++;
break;
}
if (j==Idx) //NotFound, we have to add the new element
{
//First check if we have room in the array
if (Idx>=CurrSize/sizeof(EVAL))
{ //We have to expand array
PDEBUG("Expanding array, Idx=%u\n", Idx);
getchar();
void *p = realloc(pE, CurrSize+STEP);
if (!p)
{
free(pE);
return FALSE;
}
pE = p;
CurrSize+=STEP;
}
//Add new element
pE[Idx].val = a[i];
pE[Idx].Count = 1;
PDEBUG("Adding element, val[%u]=%u\n", Idx, pE[Idx].val);
Idx++;
}
}
//Now search the mode of reduced array
DWORD IdxMax = 0;

for (DWORD i=0; i<Idx; i++)
if (pE[i].Count>pE[IdxMax].Count)
IdxMax = i;

//Copy results
*Val  = pE[IdxMax].val;
*freq = pE[IdxMax].Count;

//Free memory
free(pE);

return TRUE;
}

void test(void)
{
DWORD val;
int freq;

//Fill a test array
for (int i=0; i<sizeof(array)/sizeof(DWORD); i++)
#if 1
if (i<1000)
array[i] = 1;
else if (i<100000)
array[i] = 2;
else
array[i] = 3;
#else
array[i] = rand();
#endif
ArrayMode(array, ARRAY_SIZE, &val, &freq);
printf("Array mode: value=%u, freq=%d\n", val, freq);
}
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 04, 2014, 06:18:06 pm
This code takes just over 4 secs here.

Time the qsort separately. It will be just under 4 secs ;-)

Quote
If it's no good I'll stop. :)

It seems you are doing exactly what I did in my tests with assembler. So don't stop - unless we see something faster, it seems the way to go ;-)
Title: Re: frequency of ints in huge array
Post by: Bitbeisser on July 05, 2014, 03:41:27 am
I would suggest just start counting from the top down, ...
Do you mean something like that?
Code: [Select]
maxcount = 0;
for (i=0; i<MAX; i++) {
  count = 0;
  for (k=i; k<MAX; k++)
    if (a[k] == a[i]) count++;
  if (count > maxcount) {
    maxcount = count;
    index = i;
  }
}
This is very slow for huge arrays! O(n²)
May be, that this can be optimized. The search can be terminated if maxcount > (MAX-i). But this depends on the data.
That's what I mentioned in my original reply. It highly depends on the spread of the values within the array.
I have done something similar years back, just need to dig it out, but I am not frequently at a computer, there's a long weekend here in the US of A and as I have no vacation time being self-employed, it's as close as any "time off" will be for a while...

Ralf
Title: Re: frequency of ints in huge array
Post by: JohnF on July 05, 2014, 08:36:51 am
This code takes just over 4 secs here.

Time the qsort separately. It will be just under 4 secs ;-)

Quote
If it's no good I'll stop. :)

It seems you are doing exactly what I did in my tests with assembler. So don't stop - unless we see something faster, it seems the way to go ;-)

I've tried two radix sorts and they were slower - maybe they were not great implementations, I don't know.

EDIT:

Changing the compare function slightly

Code: [Select]
int __cdecl CompareInt(const void * param1, const void * param2)
{
if(*(unsigned long int *)param1 < *(unsigned long int *)param2)
return 1;
else if(*(unsigned long int *)param1 > *(unsigned long int *)param2)
return -1;
else
return 0;
}

and setting maximize speed, the time comes down to 3.1 secs.
 
John
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 05, 2014, 11:17:15 am
... setting maximize speed, the time comes down to 3.1 secs.

Sounds good. Which CPU? Mine is a Celeron M.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 05, 2014, 11:26:59 am
... setting maximize speed, the time comes down to 3.1 secs.

Sounds good. Which CPU? Mine is a Celeron M.

Pentium(R) Dual-Core 2.7 GHz

John
Title: Re: frequency of ints in huge array
Post by: czerny on July 05, 2014, 01:43:47 pm
If it's no good I'll stop. :)
What do you want to stop? Stop breathing? Stop coding? Stop posting in the Pelles C forum?
Don't do it, please! ;)
Title: Re: frequency of ints in huge array
Post by: czerny on July 05, 2014, 02:06:46 pm
I would propose a different approach that is very efficient if the distribution is narrow (we have many ripetitions)
Try this (there are many (10000000 x 17)):
Code: [Select]
for (int i=0; i<sizeof(array)/sizeof(DWORD); i++)
  if (i%2) array[i] = 17; else array[i] = i;
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 05, 2014, 02:46:40 pm
377   Celeron M (http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Celeron+M+1.60GHz&id=710)
1632  Pentium(R) Dual-Core 2.7 GHz (http://www.cpubenchmark.net/cpu.php?cpu=Intel+Pentium+E5400+%40+2.70GHz)
Title: Re: frequency of ints in huge array
Post by: JohnF on July 05, 2014, 04:31:01 pm
377   Celeron M (http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Celeron+M+1.60GHz&id=710)
1632  Pentium(R) Dual-Core 2.7 GHz (http://www.cpubenchmark.net/cpu.php?cpu=Intel+Pentium+E5400+%40+2.70GHz)

One could go higher! ;)

John
Title: Re: frequency of ints in huge array
Post by: JohnF on July 05, 2014, 04:43:29 pm
Czerny, if you can get Pelle's OpenMP stuff working with his quick_sort I believe it would speed up the sorting quite a bit.

However, I've not been able get it working with unsigned int's.

http://www.smorgasbordet.com/pellesc/zip/qsort.zip (http://www.smorgasbordet.com/pellesc/zip/qsort.zip)

The fastest is SORT_METHOD 3

In his example you can sort 20000000 floats in 1.6 secs.

John
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 05, 2014, 05:01:34 pm
If you are not too disgusted by assembler, try the attachment. It runs either on the commandline argument, or (if there are no args) on a file called FatArray.dat
Title: Re: frequency of ints in huge array
Post by: JohnF on July 05, 2014, 05:40:12 pm
That's certainly fast, 1.6 secs in total here.

You could make a library that can be called from C?

All your best stuff in 32 and 64 bit libs. :)

John

Title: Re: frequency of ints in huge array
Post by: czerny on July 05, 2014, 07:00:20 pm
If you are not too disgusted by assembler, try the attachment. It runs either on the commandline argument, or (if there are no args) on a file called FatArray.dat
Hmm!

On my old machine I got for copying + sorting + finding:

jj2700  :  2843 ms
john     :  7780 ms (with qsort)
MySort : 37326 ms

I wonder why MySort is soooo slow!

Code: [Select]
void MySort(unsigned long arr[], unsigned long l, unsigned long r)
{
if (l<r) {
unsigned long h, k, last, first = l+1, p = arr[l];

last = l;
for (k=l+1; k<=r; k++) {
if (arr[k] < p) {
h = arr[k]; arr[k] = arr[first]; arr[first] = h;
last = first++;
}
}
if (last > l) {
arr[l] = arr[last]; arr[last] = p;
}

if (last > l) MySort(arr, l, last-1);
if (last < r) MySort(arr, last+1, r);
}
}

jj2700: Your code for sorting is not included. I have downloaded SetupMasmBasic and executed SetupMasmBasic.exe but I got only an empty dialog with 'last chance ...' as title and 5 files in a temp directory.

I have found a nonrecursiv asm version on the net: http://www.cs.hofstra.edu/~cscccl/csc111/qsort.asm (http://www.cs.hofstra.edu/~cscccl/csc111/qsort.asm)

I would like to learn, howto make this Pelles C compatible. This should be easy in that case because der is a wrapper function for C.

Edit: Ok, I have changed .386p to .386 and deleted the .stack directive (don't know if this was right?)!

But it runs: 4587 ms

Title: Re: frequency of ints in huge array
Post by: czerny on July 05, 2014, 07:03:16 pm
Czerny, if you can get Pelle's OpenMP stuff working with his quick_sort I believe it would speed up the sorting quite a bit.
I neither own a machine with some sort of multi core processor nor have I an operating system at home to run the newest Pelles C.  :'(
Title: Re: frequency of ints in huge array
Post by: frankie on July 05, 2014, 08:33:03 pm
I would propose a different approach that is very efficient if the distribution is narrow (we have many ripetitions)
Try this (there are many (10000000 x 17)):
Code: [Select]
for (int i=0; i<sizeof(array)/sizeof(DWORD); i++)
  if (i%2) array[i] = 17; else array[i] = i;
Yes, but this is not a narrow distribution anyway, you have 10M values equal and 10M values different...   :-\
To speed up the calculation you have to sort the array, one of the fastest ways (if you don't hate the math) is to sample at powers of 2 steps as described here (http://www.quora.com/Algorithms/What-is-the-most-efficient-algorithm-for-calculating-the-mode-of-an-array-of-integers)...  :P
P.S. if the time is relevant, and in your case with so large arrays of data it is the case, the suggestion from JJ to use assembler is appropriate.  ;)

EDIT: if you reduce to 20K different values it takes 350ms on my machine:
Code: [Select]
for (int i=0; i<sizeof(array)/sizeof(DWORD); i++)
  if (i%1000) array[i] = 17; else array[i] = i;
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 05, 2014, 08:41:09 pm
I got only an empty dialog with 'last chance ...'

Sorry... the setup is a bit mean in that it requires that Masm32 is installed (http://masm32.com/installation.htm) - the reason being that an assembler library can do harm in the wrong hands. That's why there is this little obstacle.

Anyway, I'll see if I can provide you a *.lib file with the integer sort.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 05, 2014, 08:49:53 pm
Here is Pelle's OPenMP quick_sort (SORT_METHOD  3, fastest) now working with unsigned int.

Here it completes a 20000000 sort in 1.35 secs.

For anyone but Czerny, that is.

John
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 05, 2014, 09:53:33 pm
That is pretty fast! I managed to add the sort to a little library (attached), here is a test:

#include <stdio.h>
#include <windows.h>      // MessageBox
#pragma warn(disable:2216)         // retval never used
#pragma warn(disable:2118)          // para not referenced
#pragma comment(lib, "\\Masm32\\MasmBasic\\Res\\xTimer.lib")      // adjust to your path

extern void      __cdecl xTimerStart(void);            // uses QPC and returns
extern int      __stdcall xTimerEnd(void);            // microseconds

extern int __stdcall MbArrSort(int*pStart, int elements, int pKey, int mode);        // MasmBasic ArraySort

int main(int argc, char* argv[]) {

  int i, MyArr[6]={123, 789, 456, 111, 333, 222};

  xTimerStart();
  // mode=size (4, 8 ) or (32 and real) or (64 and ascending)
  MbArrSort(&MyArr[0], 6, 0, 4+64);      // size 4, integer, ascending
  printf("Sorting took %i microseconds\n", xTimerEnd());


  for ( i=0; i<6; i++ ) printf("%i\t%i\n", i, MyArr[ i ] )

  return MessageBox(0, "Sorted?", "Confirm:", MB_YESNO | MB_ICONINFORMATION);
}

What is rather odd is that the linker can't find MbArrSort unless xTimerStart is also being used. I often call assembler routines from C, it always works, and sort and timer are not connected in any way, so for the time being it's a mystery...

MbArrSort handles also QWORD, float and double arrays, see the mode comment.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 06, 2014, 08:50:26 am
POLINK: fatal error: File not found: '\masm32\lib\masm32.lib'.

masm32.lib   
gdi32.lib   
user32.lib   
Comctl32.lib   
Comdlg32.lib   
shell32.lib   
oleaut32.lib   
ole32.lib   
msvcrt.lib   
masmbasic.lib   

You asked if I were disgusted by asm, well I'm getting close!

John
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 06, 2014, 01:40:14 pm
POLINK: fatal error: File not found: '\masm32\lib\masm32.lib'.

John,

It seems there are some dependencies that require a Masm32 installation (http://masm32.com/masmdl.htm) - sorry for that.

In any case, Pelle's quicksort is probably fast enough. My sort is a radix sort, it is faster than the Masm32 library quicksort if the integer range is not fully used, but somewhat (10%?) slower if the full 32-bit are present in the array.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 06, 2014, 02:33:22 pm
Yes Pelle's qsort is good enough. I've tried other implementations of qsort and Pelle's wins every time. In fact the C run time lib has many fast routines in it.

John
Title: Re: frequency of ints in huge array
Post by: czerny on July 06, 2014, 02:52:59 pm
Yes Pelle's qsort is good enough. I've tried other implementations of qsort and Pelle's wins every time. In fact the C run time lib has many fast routines in it.
Must be an asm-implementation. But I do not understand the speed factor 4 or 5 comparing MySort and qsort. What could be optimized?

My pivot element is the very first element in an intervall. This should be no big problem if the data is sort of random. One could choose the median as a better choice, but to find the median costs time.

What else could be optimized?
Title: Re: frequency of ints in huge array
Post by: JohnF on July 06, 2014, 03:01:32 pm
Yes Pelle's qsort is good enough. I've tried other implementations of qsort and Pelle's wins every time. In fact the C run time lib has many fast routines in it.
Must be an asm-implementation. But I do not understand the speed factor 4 or 5 comparing MySort and qsort. What could be optimized?

My pivot element is the very first element in an intervall. This should be no big problem if the data is sort of random. One could choose the median as a better choice, but to find the median costs time.

What else could be optimized?

You could try a different sort method, maybe it would better qsort on this occasion.

However, if I were you I'd pick the fastest option available which is probably Pelle's qsort - unless you can find a fast asm routine that is.

I've tried to better Pelle's CRT routines with asm but have always been beaten by the CRT lib.
 
EDIT: attached is M.S. qsort.c. I've not tried it.

John

Title: Re: frequency of ints in huge array
Post by: jj2007 on July 06, 2014, 04:07:04 pm
I've tried to better Pelle's CRT routines with asm but have always been beaten by the CRT lib.

The CRT routines are in general fast, and some cannot be improved with asm. However, many can be beaten easily with a factor 2-3. One reason is that most of them are still pre-SSE2 for backward compatibility.
Title: Re: frequency of ints in huge array
Post by: czerny on July 06, 2014, 04:17:25 pm
However, if I were you I'd pick the fastest option available which is probably Pelle's qsort - unless you can find a fast asm routine that is.
For my actual needs I will use qsort. That's ok.

But there is still a lack of understanding which bothers me. I do not expect to beat an asm-routine with c. I would expect a factor of, say 2. An factor 5 says to me: there must be some algorithmic improvements!
Title: Re: frequency of ints in huge array
Post by: czerny on July 06, 2014, 04:21:41 pm
... still pre-SSE2 for backward compatibility.
fortunately!
Title: Re: frequency of ints in huge array
Post by: frankie on July 06, 2014, 09:38:19 pm
I tryied another system to reduce the time without ordering the array.
It uses hashing.
You can play with hash table dimensions to reduce/extend the time or the memory used (they have opposite behaviour).
The time overheading is due, as clear to anybody, to the huge number of loops to check the array. The indexing technique used with hashing avoid loops and hash handling is widly repayed by faster access.
The project posted show how it works, but it is not optimized yet. You can get much better results with a faster hashing library that reduce memory access, and using rehashing to remove looping in bucket extension.
I hope this is useful or at least shows the right way...
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 06, 2014, 11:01:14 pm
... still pre-SSE2 for backward compatibility.
fortunately!

Hmmm....: Pelles C Release Candidate 5 (for 32-bit Windows Vista/7/8) (http://www.smorgasbordet.com/pellesc/download.htm)

That's a much tougher obstacle IMHO ;-)
Title: Re: frequency of ints in huge array
Post by: czerny on July 06, 2014, 11:13:31 pm
... still pre-SSE2 for backward compatibility.
fortunately!

Hmmm....: Pelles C Release Candidate 5 (for 32-bit Windows Vista/7/8) (http://www.smorgasbordet.com/pellesc/download.htm)
What dou mean? 32-Bit?
That's a much tougher obstacle IMHO ;-)
Title: Re: frequency of ints in huge array
Post by: jj2007 on July 07, 2014, 12:03:59 am
No, the restriction to Vista and upwards:
- about 25% of all PCs run with XP
- about 2% run with CPUs that don't have SSE2...
Title: Re: frequency of ints in huge array
Post by: czerny on July 07, 2014, 12:59:45 am
I tryied another system to reduce the time without ordering the array.
It uses hashing.
You can play with hash table dimensions to reduce/extend the time or the memory used (they have opposite behaviour).
The time overheading is due, as clear to anybody, to the huge number of loops to check the array. The indexing technique used with hashing avoid loops and hash handling is widly repayed by faster access.
The project posted show how it works, but it is not optimized yet. You can get much better results with a faster hashing library that reduce memory access, and using rehashing to remove looping in bucket extension.
I hope this is useful or at least shows the right way...
Oh, that looks promising! A little bit slower than qsort, yet. But a lot of memory for randomized data
Title: Re: frequency of ints in huge array
Post by: JohnF on July 07, 2014, 04:04:06 pm
I tryied another system to reduce the time without ordering the array.
It uses hashing.
You can play with hash table dimensions to reduce/extend the time or the memory used (they have opposite behaviour).
The time overheading is due, as clear to anybody, to the huge number of loops to check the array. The indexing technique used with hashing avoid loops and hash handling is widly repayed by faster access.
The project posted show how it works, but it is not optimized yet. You can get much better results with a faster hashing library that reduce memory access, and using rehashing to remove looping in bucket extension.
I hope this is useful or at least shows the right way...

Hey Frankie, does your brain really work this way?  :o

Code: [Select]
HashTbl->buckets[nHashIndex].entry[HashTbl->buckets[nHashIndex].cnt-1].key  = key;
HashTbl->buckets[nHashIndex].entry[HashTbl->buckets[nHashIndex].cnt-1].data = data;
return (HashTbl->buckets[nHashIndex].entry[HashTbl->buckets[nHashIndex].cnt-1].key ? 1: 0);

I've had to simplify those lines.

Anyway, it's a neat demo, thanks.

John
Title: Re: frequency of ints in huge array
Post by: frankie on July 07, 2014, 08:37:20 pm
Oh, that looks promising! A little bit slower than qsort, yet. But a lot of memory for randomized data
If you want a fast algorithm and want to use small or no memory the right solution is ... magic   ;D
Title: Re: frequency of ints in huge array
Post by: frankie on July 07, 2014, 08:51:00 pm
Hey Frankie, does your brain really work this way?  :o

Code: [Select]
HashTbl->buckets[nHashIndex].entry[HashTbl->buckets[nHashIndex].cnt-1].key  = key;
HashTbl->buckets[nHashIndex].entry[HashTbl->buckets[nHashIndex].cnt-1].data = data;
return (HashTbl->buckets[nHashIndex].entry[HashTbl->buckets[nHashIndex].cnt-1].key ? 1: 0);

I've had to simplify those lines.

Anyway, it's a neat demo, thanks.

John
John, sometimes my brain works even worst that this  ;D
Seriously the lines you are showing makes sense because in the original library key and Data were pointers. key pointed to a  structure for the key holding the key itself, hashing method, lenght of key, and some other info., Data pointed to a buffer holding user data for the key.
I made a quick and dirty simplification assigning the to key the array value and at data the frequency, and fixing the hashing function as <value % size_of_hash_table> (% stay for modulus I want clarify considering the previous post on XOR operator  ;D ;D).
The ripetiono of addressing is not a problem, in an optimized compiler it is a branch of program tree, the compiler should be able to recognize the ripetion of code and execute it only one time. Incidentally PellesC generally don't do that  :'(, but there is time to improve.....
Anyway I think that all the points and possible solutions have been showed, and as I said before remains only the magic...  ;D
Title: Re: frequency of ints in huge array
Post by: JohnF on July 08, 2014, 07:26:46 am
Oh, I thought % was for percent!  :D

John
Title: Re: frequency of ints in huge array
Post by: czerny on July 09, 2014, 07:05:55 pm
Another idea!
Title: Re: frequency of ints in huge array
Post by: JohnF on July 09, 2014, 08:01:02 pm
Another idea!

It seems to work ok, but not better than the original qsort we were playing with.

I've been messing with the various ways to deal with your problem and the old qsort method is probably the best you have. Frankie's hash, I think suffers from collisions when the value range is large, because of using the modus operator for nHashIndex.

And using a more complex hash function leads to other problems.

I managed to speed up Frankie's method but gave up when I saw problems with collisions.

Frankie may know of a way around the problems though.

John

Title: Re: frequency of ints in huge array
Post by: czerny on July 10, 2014, 09:59:52 am
Another idea!
It seems to work ok, but not better than the original qsort we were playing with.
It is much quicker, if there are many values with frequency 1.
Title: Re: frequency of ints in huge array
Post by: JohnF on July 10, 2014, 10:22:18 am
Another idea!
It seems to work ok, but not better than the original qsort we were playing with.
It is much quicker, if there are many values with frequency 1.

:)

EDIT: For a limited range of values there are very fast alternatives

Using Frankie's method with modus the routine for 20000000 values using rand()%10000 to fill the array only takes 203ms.

This code was inspired by Frankie's example, it uses a 2 dimensional array.

Code: [Select]
void insert(uint32_t value)
{
uint32_t nHashIndex = value % SIZE;

if(arry1[nHashIndex][VAL] != value){
arry1[nHashIndex][VAL] = value;
arry1[nHashIndex][FEQ]++;
}else{
arry1[nHashIndex][FEQ]++;
}
}

John
Title: Re: frequency of ints in huge array
Post by: frankie on July 11, 2014, 01:54:03 pm
Frankie's hash, I think suffers from collisions when the value range is large, because of using the modus operator for nHashIndex.

And using a more complex hash function leads to other problems.

I managed to speed up Frankie's method but gave up when I saw problems with collisions.

Frankie may know of a way around the problems though.

John
John, hashing always will generate collisions, this is the standard way it works on a finite size table.
In our case if you dimensions an hash table of 2^32 elements the hashing function h = Val % 2^32 will be a perfect hashing  ;D
You will have no collisions and direct access to all keys, but at price of an huge memory usage.
With a reduced dimensions table you can have some benefit of direct access and some loop iteration in extended buckets. In my sample using a table that is 1/100th of array syze = 200k values the maximum looping is 2^32/200k = 21475 elements (extended buckets).
To speed up this you can use, as I mentioned on my original post, a rehashing method.
Rehashing means that the extended buckets will be an hash subtable were a new hash will allow direct addressing. The sub-hash function can be Sub_hash = (Val%Main_Hash_Table_Size) % Hash_SubTable_Size, this will cover all indexes in the subtable. Still if the subtable is not big enough to get all values you will have extended buckets.
Proceeding this way you can have as many subtable as you want, and you can also decide to what extent keep the looping. The looping will be on how many possible extended buckets can exist after the permitted subhashing levels and their tables dimensions.
The advantage is that the subtables, as the extended buckets in standard hashing, will be created when necessary reducing the memory usage. Of course subhashing uses more memory of standard hashing because it can create a full dimension subtable even if only one value has to be stored, where the standard hashing would have created a single bucket.
Another optimizing way is to use the so called 'dynamic hashing', in this case the whole system is dynamic menaning that tables, subtables and hashing changes dynamically with the table contents. You can found info on dynamic hashing here (http://en.wikipedia.org/wiki/Dynamic_perfect_hashing) and here (http://www.cs.sfu.ca/CourseCentral/354/zaiane/material/notes/Chapter11/node20.html).

Anyway you have to consider also the cost of dynamic handling in terms of memory allocation, the more you play with memory OS function the more time you spend. The point is always the same the right method have to be a correct balance between speed and memory requirement.  8)

P.S. Be carefull with removing collisions, perfect hashing will lead to 2*sizeof(DWORD)*2^32 memory usage!
Title: Re: frequency of ints in huge array
Post by: JohnF on July 11, 2014, 03:31:57 pm
Frankie, thanks for the explanation.

John
Title: Re: frequency of ints in huge array
Post by: JohnF on July 11, 2014, 04:19:54 pm
Earlier we were talking about how it's difficult to beat the CRT library functions for speed.

Well, it's worth trying inline and maximum speed, some functions can be faster than the CRT.

I have not bothered with inlining much before as compilers are free not to inline the code, but it seems worth doing now with PellesC.

John
Title: Re: frequency of ints in huge array
Post by: JohnF on July 12, 2014, 04:04:58 pm
Yes Pelle's qsort is good enough. I've tried other implementations of qsort and Pelle's wins every time. In fact the C run time lib has many fast routines in it.
Must be an asm-implementation. But I do not understand the speed factor 4 or 5 comparing MySort and qsort. What could be optimized?

My pivot element is the very first element in an intervall. This should be no big problem if the data is sort of random. One could choose the median as a better choice, but to find the median costs time.

What else could be optimized?

czerny, here is a good qsort, it's only slightly slower than Pelle's qsort. Compile it maximum speed. It's from Tiny Lib C.

Code: [Select]
static void shortsort(char *lo, char *hi, size_t width, int (*comp)(const void *, const void *));
static void swap(char *p, char *q, size_t width);

#define CUTOFF 8
#define STKSIZ (8*sizeof(void*) - 2)

void q_sort(void *base, size_t num, size_t width, int (*comp) (const void *, const void *))
{
/**************************************************************
  Note: the number of stack entries required is no more than
   1 + log2(num), so 30 is sufficient for any array
***************************************************************/
char *lo, *hi; // ends of sub-array currently sorting
char *mid; // points to middle of subarray
char *loguy, *higuy; // traveling pointers for partition step
size_t size; // size of the sub-array
char *lostk[STKSIZ], *histk[STKSIZ];
int stkptr; // stack for saving sub-array to be processed

if (num < 2)
return; /* nothing to do */

stkptr = 0; /* initialize stack */

lo = (char *)base;
hi = (char *)base + width * (num - 1); /* initialize limits */

/* this entry point is for pseudo-recursion calling: setting
   lo and hi and jumping to here is like recursion, but stkptr is
   preserved, locals aren't, so we preserve stuff on the stack */
  recurse:

size = (hi - lo) / width + 1; /* number of el's to sort */

/* below a certain size, it is faster to use a O(n^2) sorting method */
if (size <= CUTOFF)
{
shortsort(lo, hi, width, comp);
}
else
{
/* First we pick a partitioning element.  The efficiency of the
   algorithm demands that we find one that is approximately the median
   of the values, but also that we select one fast.  We choose the
   median of the first, middle, and last elements, to avoid bad
   performance in the face of already sorted data, or data that is made
   up of multiple sorted runs appended together.  Testing shows that a
   median-of-three algorithm provides better performance than simply
   picking the middle element for the latter case. */

mid = lo + (size / 2) * width; /* find middle element */

/* Sort the first, middle, last elements into order */
if (comp(lo, mid) > 0)
swap(lo, mid, width);
if (comp(lo, hi) > 0)
swap(lo, hi, width);
if (comp(mid, hi) > 0)
swap(mid, hi, width);

/* We now wish to partition the array into three pieces, one consisting
   of elements <= partition element, one of elements equal to the
   partition element, and one of elements > than it.  This is done
   below; comments indicate conditions established at every step. */

loguy = lo;
higuy = hi;

/* Note that higuy decreases and loguy increases on every iteration,
   so loop must terminate. */
for (;;)
{
/* lo <= loguy < hi, lo < higuy <= hi,
   A[i] <= A[mid] for lo <= i <= loguy,
   A[i] > A[mid] for higuy <= i < hi,
   A[hi] >= A[mid] */

/* The doubled loop is to avoid calling comp(mid,mid), since some
   existing comparison funcs don't work when passed the same
   value for both pointers. */

if (mid > loguy)
{
do
{
loguy += width;
} while (loguy < mid && comp(loguy, mid) <= 0);
}
if (mid <= loguy)
{
do
{
loguy += width;
} while (loguy <= hi && comp(loguy, mid) <= 0);
}

/* lo < loguy <= hi+1, A[i] <= A[mid] for lo <= i < loguy,
   either loguy > hi or A[loguy] > A[mid] */

do
{
higuy -= width;
} while (higuy > mid && comp(higuy, mid) > 0);

/* lo <= higuy < hi, A[i] > A[mid] for higuy < i < hi,
   either higuy == lo or A[higuy] <= A[mid] */

if (higuy < loguy)
break;

/* if loguy > hi or higuy == lo, then we would have exited, so
   A[loguy] > A[mid], A[higuy] <= A[mid],
   loguy <= hi, higuy > lo */

swap(loguy, higuy, width);

/* If the partition element was moved, follow it.  Only need
   to check for mid == higuy, since before the swap,
   A[loguy] > A[mid] implies loguy != mid. */

if (mid == higuy)
mid = loguy;

/* A[loguy] <= A[mid], A[higuy] > A[mid]; so condition at top
   of loop is re-established */
}

/* A[i] <= A[mid] for lo <= i < loguy,
   A[i] > A[mid] for higuy < i < hi,
   A[hi] >= A[mid]
   higuy < loguy
   implying:
   higuy == loguy-1
   or higuy == hi - 1, loguy == hi + 1, A[hi] == A[mid] */

/* Find adjacent elements equal to the partition element.  The
   doubled loop is to avoid calling comp(mid,mid), since some
   existing comparison funcs don't work when passed the same value
   for both pointers. */

higuy += width;
if (mid < higuy)
{
do
{
higuy -= width;
} while (higuy > mid && comp(higuy, mid) == 0);
}
if (mid >= higuy)
{
do
{
higuy -= width;
} while (higuy > lo && comp(higuy, mid) == 0);
}

/* OK, now we have the following:
   higuy < loguy
   lo <= higuy <= hi
   A[i]  <= A[mid] for lo <= i <= higuy
   A[i]  == A[mid] for higuy < i < loguy
   A[i]  >  A[mid] for loguy <= i < hi
   A[hi] >= A[mid] */

/* We've finished the partition, now we want to sort the subarrays
   [lo, higuy] and [loguy, hi].
   We do the smaller one first to minimize stack usage.
   We only sort arrays of length 2 or more. */

if (higuy - lo >= hi - loguy)
{
if (lo < higuy)
{
lostk[stkptr] = lo;
histk[stkptr] = higuy;
++stkptr;
} /* save big recursion for later */

if (loguy < hi)
{
lo = loguy;
goto recurse; /* do small recursion */
}
}
else
{
if (loguy < hi)
{
lostk[stkptr] = loguy;
histk[stkptr] = hi;
++stkptr; /* save big recursion for later */
}

if (lo < higuy)
{
hi = higuy;
goto recurse; /* do small recursion */
}
}
}

/* We have sorted the array, except for any pending sorts on the stack.
   Check if there are any, and do them. */

--stkptr;
if (stkptr >= 0)
{
lo = lostk[stkptr];
hi = histk[stkptr];
goto recurse; /* pop subarray from stack */
}
else
return; /* all subarrays done */
}

static void shortsort(char *lo, char *hi, size_t width, int (*comp) (const void *, const void *))
{
char *p, *max;

/* Note: in assertions below, i and j are alway inside original bound of
   array to sort. */

while (hi > lo)
{
/* A[i] <= A[j] for i <= j, j > hi */
max = lo;
for (p = lo + width; p <= hi; p += width)
{
/* A[i] <= A[max] for lo <= i < p */
if (comp(p, max) > 0)
max = p;
/* A[i] <= A[max] for lo <= i <= p */
}

/* A[i] <= A[max] for lo <= i <= hi */

swap(max, hi, width);

/* A[i] <= A[hi] for i <= hi, so A[i] <= A[j] for i <= j, j >= hi */

hi -= width;

/* A[i] <= A[j] for i <= j, j > hi, loop top condition established */
}
/* A[i] <= A[j] for i <= j, j > lo, which implies A[i] <= A[j] for i < j,
   so array is sorted */
}

static void swap(char *a, char *b, size_t width)
{
char tmp = 0;

if (a != b)
// Do the swap one character at a time to avoid potential alignment problems.
while (width--)
{
tmp = *a;
*a++ = *b;
*b++ = tmp;
}
}

John
Title: Re: frequency of ints in huge array
Post by: czerny on July 14, 2014, 06:34:05 pm
Thank you John,

on my old machine it is even quicker:

Pelles qsort: 7035 ms, TinyLib q_sort: 6267 ms (5486 ms for same with unsigned long swap)

I will try to rewrite it for a special unsigned long version with inlined swaps.

Edit: I am a little bit confused about the following statement:

   /**************************************************************
     Note: the number of stack entries required is no more than
      1 + log2(num), so 30 is sufficient for any array
   ***************************************************************/

If num = 2^32 --> 1 + log2(num) = 33 > 30
Title: Re: frequency of ints in huge array
Post by: JohnF on July 14, 2014, 08:22:58 pm
Thank you John,

on my old machine it is even quicker:

Pelles qsort: 7035 ms, TinyLib q_sort: 6267 ms (5486 ms for same with unsigned long swap)

I will try to rewrite it for a special unsigned long version with inlined swaps.

Edit: I am a little bit confused about the following statement:

   /**************************************************************
     Note: the number of stack entries required is no more than
      1 + log2(num), so 30 is sufficient for any array
   ***************************************************************/

If num = 2^32 --> 1 + log2(num) = 33 > 30

The TinyLib qsort is based on the Microsoft one but with improvements.

As for this >> 1 + log2(num) = 33

I can't comment on it what the guy says. Whatever, that would be very large array indeed.
 
Change
#define STKSIZ (8*sizeof(void*) - 2)
to
#define STKSIZ (8*sizeof(void*))

John
Title: Re: frequency of ints in huge array
Post by: JohnF on July 17, 2014, 04:20:12 pm
For anyone interested in OpenMP stuff.

Attached is a small project that shows how to use OpenMP with the CRT qsort.

When OPENMPTEST is defined as 0 no OpenMP stuff is used, when 1, it is used.

timings are 0 > 3285ms, 1 > 1969ms.

John
Title: Re: frequency of ints in huge array
Post by: neo313 on July 17, 2014, 04:35:01 pm
For anyone interested in OpenMP stuff.

Attached is a small project that shows how to use OpenMP with the CRT qsort.

When OPENMPTEST is defined as 0 no OpenMP stuff is used, when 1, it is used.

timings are 0 > 3285ms, 1 > 1969ms.

John

I get this error:  (90): error #2918: [omp] Unrecognized or unsupported form of 'for' construct.

I'm using RC5
Title: Re: frequency of ints in huge array
Post by: JohnF on July 17, 2014, 05:11:35 pm
For anyone interested in OpenMP stuff.

Attached is a small project that shows how to use OpenMP with the CRT qsort.

When OPENMPTEST is defined as 0 no OpenMP stuff is used, when 1, it is used.

timings are 0 > 3285ms, 1 > 1969ms.

John

I get this error:  (90): error #2918: [omp] Unrecognized or unsupported form of 'for' construct.

That's strange. I can't say what's wrong.

EDIT: Are you using the latest version of PellesC?

John