
Author Topic: Reading a tab-delimited text file into a two-dimensional array  (Read 38714 times)

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2091
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #45 on: May 02, 2014, 02:41:53 PM »
StopTimer returns elapsedTime in msecs.
May the source be with you
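
For readers without the attachment, a timer pair that reports elapsed milliseconds can be sketched in portable C11 roughly as follows. The names `sketch_start_timer` and `sketch_stop_timer` are hypothetical stand-ins, not Timo's actual StartTimer/StopTimer code, and `timespec_get` reads the wall clock, so a clock adjustment during the run would skew the result.

```c
#include <time.h>

/* Hypothetical stand-in for the thread's StartTimer/StopTimer pair;
   the real implementation is in the attachment and is not shown here. */
typedef struct {
    struct timespec start;
} SketchTimer;

/* Record the current time (C11 timespec_get, wall clock). */
void sketch_start_timer(SketchTimer *t)
{
    timespec_get(&t->start, TIME_UTC);
}

/* Return the time elapsed since sketch_start_timer() in milliseconds. */
double sketch_stop_timer(const SketchTimer *t)
{
    struct timespec now;
    timespec_get(&now, TIME_UTC);
    return (double)(now.tv_sec - t->start.tv_sec) * 1000.0
         + (double)(now.tv_nsec - t->start.tv_nsec) / 1.0e6;
}
```

On Windows, a counter such as QueryPerformanceCounter is the more usual choice for this job; the portable call is used here only to keep the sketch self-contained.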

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #46 on: May 02, 2014, 03:03:35 PM »
OK, got it:
Loading 43123 rows took 0.3145 seconds
Loading 43123 rows took 0.3132 seconds
Loading 43123 rows took 0.3124 seconds
Loading 43123 rows took 0.3103 seconds
Loading 43123 rows took 0.3123 seconds
Loading 43123 rows took 309 milliseconds
Loading 43123 rows took 313 milliseconds
Loading 43123 rows took 310 milliseconds
Loading 43123 rows took 310 milliseconds
Loading 43123 rows took 313 milliseconds


No big variation here...

czerny

  • Guest
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #47 on: May 02, 2014, 05:12:12 PM »
I have for example the following timings:
Code:
for (i=0; i<10; i++)
{
  LoadTabFileJJ();
  Czernys();
  LoadTabFileJJ();
}

158.8260519144 ms       180.9129372588 ms       160.1033092195 ms
162.1560840833 ms       181.4037817656 ms       163.1542556386 ms
224.5581491502 ms       205.7775499400 ms       207.7685851135 ms
210.3840013186 ms       227.3836479217 ms       242.3699101422 ms
193.2228816791 ms       222.2067329786 ms       178.7003655493 ms
177.3895844304 ms       342.9254276731 ms       188.9189827199 ms
180.6846959600 ms       280.0210641297 ms       245.3736438570 ms
189.7260685366 ms       221.5546948006 ms       182.0309564484 ms
178.0321241946 ms       213.9358493887 ms       244.7769199717 ms
242.6847546266 ms       228.9170830371 ms       242.0028243813 ms

Here with Timo's code:

160 ms  180 ms  162 ms
252 ms  249 ms  211 ms
207 ms  237 ms  180 ms
181 ms  218 ms  181 ms
245 ms  223 ms  208 ms
207 ms  281 ms  196 ms
183 ms  224 ms  287 ms
236 ms  1373 ms 454 ms
182 ms  1248 ms 486 ms
180 ms  2177 ms 492 ms

Could the last three lines be caused by swapping?

Here are Timo's timer and mine in parallel:

Code:
for (i=0; i<10; i++)
{
   LoadTabFileJJ();
}

163 ms  163.1665477037 ms
220 ms  220.7501232699 ms
192 ms  192.2875672746 ms
165 ms  165.9534940893 ms
220 ms  220.4213105297 ms
197 ms  197.9391235478 ms
211 ms  211.4291062132 ms
2579 ms 2579.4286704036 ms
2684 ms 2684.3026138797 ms
3112 ms 3112.7481286029 ms

Here both timers are consistent. But what about the variations?
« Last Edit: May 02, 2014, 05:30:00 PM by czerny »
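
One common way to tame run-to-run variation like this is to summarise a series by its minimum and median instead of reading each run on its own, so that a few slow outliers stop dominating the picture. A minimal sketch; `summarize_timings` is a hypothetical helper, not part of anyone's code in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort n timings (n >= 1) in place and report their minimum and median. */
void summarize_timings(double *ms, size_t n, double *min_ms, double *median_ms)
{
    qsort(ms, n, sizeof ms[0], cmp_double);
    *min_ms = ms[0];
    *median_ms = (n % 2) ? ms[n / 2] : (ms[n / 2 - 1] + ms[n / 2]) / 2.0;
}
```

Applied to the ten LoadTabFileJJ runs above (163 to 3112 ms), this gives a minimum of 163 ms and a median of 215.5 ms, so the three multi-second runs show up as outliers rather than as the typical cost.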

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #48 on: May 02, 2014, 05:35:15 PM »
Swapping is somewhat unlikely for the handful of bytes we are dealing with. Here are my timings on the old Celeron notebook with a lousy 2 GB of RAM, tested with MbTimer() and Start/StopTimer():

Loading 43123 rows took 0.203 seconds with LoadTabFileJJ
Loading 43123 rows took 0.196 seconds with LoadTabFileJJ
Loading 43123 rows took 0.200 seconds with LoadTabFileJJ
Loading 43123 rows took 0.203 seconds with LoadTabFileJJ
Loading 43123 rows took 0.200 seconds with LoadTabFileJJ
Loading 43123 rows took 196 milliseconds with LoadTabFileJJ
Loading 43123 rows took 197 milliseconds with LoadTabFileJJ
Loading 43123 rows took 197 milliseconds with LoadTabFileJJ
Loading 43123 rows took 200 milliseconds with LoadTabFileJJ
Loading 43123 rows took 198 milliseconds with LoadTabFileJJ

czerny

  • Guest
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #49 on: May 02, 2014, 05:39:12 PM »
Here is the two-timer test from above on a somewhat quicker machine:

131 ms  131.1705817359 ms
143 ms  143.6193706183 ms
130 ms  130.6548737340 ms
131 ms  131.1560547500 ms
147 ms  147.2851996553 ms
131 ms  131.0951531549 ms
130 ms  130.4649054559 ms
130 ms  130.3785816354 ms
130 ms  130.7406388242 ms
131 ms  131.2108103125 ms

czerny

  • Guest
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #50 on: May 02, 2014, 05:50:23 PM »
Hey JJ2007, I no longer get those 4-digit measurements after freeing your buffer!

255 ms  255.4034102100 ms
223 ms  223.6789871338 ms
197 ms  197.7851933695 ms
215 ms  215.1318114453 ms
222 ms  222.2874694968 ms
197 ms  197.4552631689 ms
210 ms  210.9949728248 ms
235 ms  235.9663029798 ms
183 ms  183.3612931252 ms
192 ms  192.9044054482 ms

Here are 5 timers: Timo's, GetTickCount, timeGetTime, _rdtsc, and mine:


Timo     GetTickCount  timeGetTime  _rdtsc             czerny
590 ms   594 ms        591 ms       1414755126 cycles  590.9614210745 ms
167 ms   172 ms        168 ms       400770772 cycles   167.4084276074 ms
166 ms   156 ms        167 ms       398143312 cycles   166.3121988968 ms
167 ms   156 ms        168 ms       401748552 cycles   167.8168594053 ms
203 ms   203 ms        203 ms       485991200 cycles   203.0087622868 ms
197 ms   188 ms        197 ms       472113756 cycles   197.2110980586 ms
245 ms   235 ms        245 ms       586719924 cycles   245.0811485817 ms
200 ms   203 ms        202 ms       480600504 cycles   200.7559620008 ms
276 ms   281 ms        277 ms       663128552 cycles   276.9974954917 ms
2083 ms  2078 ms       2083 ms      693108120 cycles   2083.9235154189 ms

Look at the last two lines. There is a big difference in ms but a small difference in cycles?
And, JJ2007, there is a 4-digit time again! :-(
« Last Edit: May 02, 2014, 06:26:31 PM by czerny »
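
One possible explanation for the cycle count in the last row (an assumption, nothing in the thread confirms it) is that the counter was stored or printed as a 32-bit value and wrapped around. The other rows imply roughly 2.39 million cycles per millisecond; at that rate about 2084 ms corresponds to nearly 5 billion cycles, which no longer fits in 32 bits. The sketch below checks that arithmetic; `cycles_per_ms` and `wrapped_cycles` are hypothetical helper names.

```c
#include <stdint.h>

/* Cycles per millisecond implied by one (cycles, ms) measurement. */
double cycles_per_ms(double cycles, double ms)
{
    return cycles / ms;
}

/* Cycle count that a 32-bit counter would show after elapsed_ms at the
   given rate: the full count reduced modulo 2^32. */
uint32_t wrapped_cycles(double rate_cycles_per_ms, double elapsed_ms)
{
    uint64_t full = (uint64_t)(rate_cycles_per_ms * elapsed_ms);
    return (uint32_t)(full & 0xFFFFFFFFu);
}
```

Taking the rate from the second row (400770772 cycles in 167.4084276074 ms) and the last row's 2083.9235154189 ms, the wrapped value lands within roughly a million cycles of the reported 693108120, which is well under a millisecond of difference. That would explain the odd cycle figure, but not why the run itself took two seconds.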

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2091
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #51 on: May 02, 2014, 07:04:28 PM »
As I'm a hobbyist, I shall share this test project too.

32-bit:
Process: 58 ms
43123 nRows

64-bit:
Process: 71 ms
43123 nRows

A bit slow, but who cares :D

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #52 on: May 02, 2014, 07:52:03 PM »
Quote
A bit slow, but who cares :D

Practically on par with my algo (MbRecall is fast but doesn't count in this context).

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #53 on: May 02, 2014, 07:54:01 PM »
Quote
Hey JJ2007, I no longer get those 4-digit measurements after freeing your buffer!

Which buffer?

Quote
Look at the last two lines. There is a big difference in ms but a small difference in cycles?
And, JJ2007, there is a 4-digit time again! :-(

Mysterious. Doesn't make any sense 8)

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2091
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #54 on: May 02, 2014, 09:04:52 PM »
Thanks, Jochen, for that BrainTrainer material.
That database.tab was good learning material for seeing some weaknesses of Excel and LibreOffice.
I hope that beginners enjoy this journey through those csv/tbs/tab files :)

Thanks to Pelle for Pelles C, which made this exercise possible.

Thanks to Lara Fabian's CD LARA FABIAN for keeping me in a good mood when programming :).

When I go on my daily walk in the fields, Paulina Rubio and Laura Pausini put my legs in the right beat :D
And I have Patricia Kaas and Middle of the Road (Soley Soley if the sun shines) in my MP3 player too 8)
« Last Edit: May 02, 2014, 11:49:14 PM by TimoVJL »

czerny

  • Guest
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #55 on: May 02, 2014, 09:46:11 PM »
Here is my last version. I had a lot of fun with this little exercise, too.
« Last Edit: May 02, 2014, 09:48:28 PM by czerny »

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: Reading a tab-delimited text file into a two-dimensional array
« Reply #56 on: May 03, 2014, 12:25:47 AM »
Thanks a lot to you, Timo and Czerny and Robert, for your help. I've learnt some new tricks ;-)

And you forced me to update my assembler library, because it didn't treat quotes and commas correctly. It's ok now, although not yet online.

BTW, in case you wondered how MbRecall can be 5 times as fast: it always reads whole lines and stores only each line's start and length. Then, if you ask for element[5,9], it goes to row 5 and extracts the substring between the 9th and the 10th tab (counting columns from zero).
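
The idea of indexing whole lines and extracting a cell only on request can be sketched in C roughly like this. It is an illustration under assumptions, not MbRecall's actual assembler code: `LineRef`, `index_lines` and `get_cell` are hypothetical names, columns are counted from zero, and the file is assumed to be already loaded into a NUL-terminated buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One entry per line: where it starts in the buffer and how long it is. */
typedef struct {
    const char *start;
    size_t len;
} LineRef;

/* Index the lines of a NUL-terminated buffer without copying any text.
   Returns a malloc'ed array (caller frees) and stores the count in *n_lines.
   Assumes '\n' line ends; a trailing '\r' is dropped from each line. */
LineRef *index_lines(const char *buf, size_t *n_lines)
{
    size_t cap = 16, n = 0;
    LineRef *lines = malloc(cap * sizeof *lines);
    const char *p = buf;
    if (!lines) return NULL;
    while (*p) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);
        if (n == cap) {
            LineRef *tmp = realloc(lines, (cap *= 2) * sizeof *lines);
            if (!tmp) { free(lines); return NULL; }
            lines = tmp;
        }
        lines[n].start = p;
        lines[n].len = (len > 0 && p[len - 1] == '\r') ? len - 1 : len;
        n++;
        p += len;
        if (*p == '\n') p++;
    }
    *n_lines = n;
    return lines;
}

/* Locate cell `col` (zero-based) in a line by skipping `col` tabs.
   Returns a pointer into the line and stores the cell length in *cell_len,
   or returns NULL if the line has fewer columns. */
const char *get_cell(const LineRef *line, size_t col, size_t *cell_len)
{
    const char *p = line->start;
    const char *end = line->start + line->len;
    const char *tab;
    while (col > 0) {
        tab = memchr(p, '\t', (size_t)(end - p));
        if (!tab) return NULL;
        p = tab + 1;
        col--;
    }
    tab = memchr(p, '\t', (size_t)(end - p));
    *cell_len = (size_t)((tab ? tab : end) - p);
    return p;
}
```

Loading is cheap because only one pass over the buffer is needed and no cell is copied; the price is a short scan for tabs on every cell access, which is why this kind of timing is hard to compare with readers that fill a full two-dimensional array up front.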

@Timo: Soley Soley - reminds me of the good times when I hung around in discos. You might like The Rain, the Park and Other Things ;-)

P.S.: New test files here, with:
- original csv file
- tab file converted by Excel 2010
- tab file produced by my own converter

When testing csv or tab readers, pay attention to the treatment of commas and quotes inside cells. I attach a file in *.xls format that has some goodies for testing.
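
To illustrate what such a reader has to cope with, here is a minimal sketch of parsing a single CSV field, assuming the common convention that a field may be wrapped in double quotes and that a quote inside a quoted field is written twice. `parse_csv_field` is a hypothetical helper, not code from any of the attachments, and it does not handle every dialect.

```c
#include <stddef.h>

/* Parse one CSV field starting at `src` into `out`.
   `out_cap` must be at least 1; the result is always NUL-terminated and
   silently truncated if the buffer is too small.
   Commas inside "quoted fields" are kept, and "" becomes a single ".
   Returns a pointer to the comma, line end or NUL that ended the field. */
const char *parse_csv_field(const char *src, char *out, size_t out_cap)
{
    size_t n = 0;
    int quoted = 0;

    if (*src == '"') {              /* opening quote */
        quoted = 1;
        src++;
    }
    while (*src) {
        if (quoted) {
            if (*src == '"') {
                if (src[1] == '"') {
                    src++;          /* doubled quote: keep one of them */
                } else {
                    quoted = 0;     /* closing quote */
                    src++;
                    continue;
                }
            }
        } else if (*src == ',' || *src == '\n' || *src == '\r') {
            break;                  /* unquoted delimiter ends the field */
        }
        if (n + 1 < out_cap)
            out[n++] = *src;
        src++;
    }
    out[n] = '\0';
    return src;
}
```

A useful test cell is one such as `"say ""hi"", ok"`, which should come out as the single value `say "hi", ok` rather than being split at the comma.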
« Last Edit: May 05, 2014, 09:28:46 AM by jj2007 »

Offline jj2007

  • Member
  • *
  • Posts: 536
Spreadsheet sorting
« Reply #57 on: March 12, 2016, 10:36:42 AM »
Just an update: I have added a "sort by numerical value of column n" function to my Recall() and QSort() procedures, see here. It's assembler, but if anybody needs a Pelles C version, let me know.
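
For anyone who wants to try the idea in C, sorting rows by the numerical value of column n could look roughly like the sketch below. This is not a port of the assembler routines; `cell_value`, `cmp_rows_by_col` and `sort_rows_by_column` are hypothetical names, rows are assumed to be NUL-terminated tab-delimited strings, and columns are counted from zero.

```c
#include <stdlib.h>
#include <string.h>

/* Column to sort by. Standard qsort() has no context argument, so this
   sketch passes it through a file-scope variable (not thread-safe). */
static size_t g_sort_col = 0;

/* Numeric value of column `col` in a tab-delimited row;
   missing or non-numeric cells count as 0.0. */
double cell_value(const char *row, size_t col)
{
    while (col > 0) {
        const char *tab = strchr(row, '\t');
        if (!tab) return 0.0;
        row = tab + 1;
        col--;
    }
    return strtod(row, NULL);   /* parsing stops at the next tab by itself */
}

/* qsort comparator: ascending by the numeric value of the chosen column. */
int cmp_rows_by_col(const void *a, const void *b)
{
    double x = cell_value(*(const char *const *)a, g_sort_col);
    double y = cell_value(*(const char *const *)b, g_sort_col);
    return (x > y) - (x < y);
}

/* Sort an array of row pointers in place by column `col`. */
void sort_rows_by_column(const char **rows, size_t n_rows, size_t col)
{
    g_sort_col = col;
    qsort(rows, n_rows, sizeof rows[0], cmp_rows_by_col);
}
```

Comparing parsed numbers rather than text matters here: as strings, "10" sorts before "9.5", while numerically it comes after.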