programming:general:phpvspythonvsperl

Differences

This shows you the differences between two versions of the page.

programming:general:phpvspythonvsperl [2008/02/01 21:07] – old revision restored crustymonkey
programming:general:phpvspythonvsperl [2008/04/14 13:55] (current) crustymonkey
Line 4: Line 4:
 ===== The Goal ===== ===== The Goal =====
 I was in a discussion yesterday with one of my co-workers about the speed of [[wp>Spamassassin]].  We were talking about how slow it is and good ways in which to speed it up.  He mentioned some optimizations in other languages, which got me wondering about exactly what the speed differences would be in a test of [[wp>PHP]], [[wp>Python]] and [[wp>Perl]].  This writeup details the results of my tests of 5 different scripts on 2 different machines running 2 different distros of [[wp>Linux]]. I was in a discussion yesterday with one of my co-workers about the speed of [[wp>Spamassassin]].  We were talking about how slow it is and good ways in which to speed it up.  He mentioned some optimizations in other languages, which got me wondering about exactly what the speed differences would be in a test of [[wp>PHP]], [[wp>Python]] and [[wp>Perl]].  This writeup details the results of my tests of 5 different scripts on 2 different machines running 2 different distros of [[wp>Linux]].
- 
  
 ===== The Hardware ===== ===== The Hardware =====
Line 345: Line 344:
 ===== The Results ===== ===== The Results =====
 The tables below show the Unix ''time'' output of these tests on each machine.  The fastest time for each language is in bold. The tables below show the Unix ''time'' output of these tests on each machine.  The fastest time for each language is in bold.
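For anyone who wants to collect the same three numbers without the shell builtin, here is a minimal Python sketch. It is not how the timings in this writeup were gathered; the helper name ''time_command'' and the one-liner being timed are made up for illustration, and it assumes a Unix-like system where ''os.times()'' reports child CPU time.

```python
import os
import subprocess
import sys


def time_command(argv):
    """Run argv to completion and return (real, user, sys) in seconds.

    Mirrors the three figures printed by the Unix `time` command:
    wall-clock time, user-mode CPU time and kernel-mode CPU time
    of the child process.
    """
    before = os.times()
    subprocess.run(argv, check=True)  # waits for the child, so its CPU time is accounted
    after = os.times()
    real = after.elapsed - before.elapsed
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Hypothetical workload: a trivial one-liner stands in for a real test script.
    real, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"real: {real:.2f}s, user: {user:.2f}s, sys: {system:.2f}s")
```

Real is wall-clock time, so it also includes time spent waiting on disk or on other processes, which is why it is normally a bit larger than user plus sys.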
 +
  
  
 ==== The Sun Box ==== ==== The Sun Box ====
-| ^ PHP ^ Python (non-compiled) ^ Python (compiled) ^ Perl (interpolated string loop) ^ Perl (hard coded regexes) ^ +| ^ PHP ^^^ Python (non-compiled) ^^^ Python (compiled) ^^^ Perl (interpolated string loop) ^^^ Perl (hard coded regexes) ^^^ 
-^ Test 1 | **real: 9.45s**, user: 9.10s, sys: 0.07s | real: 13.42s, user: 12.66s, sys: 0.11s | real: 7.97s, user: 7.20s, sys: 0.11s | real: 31.89s, **user: 29.43s**, sys: 0.17s | **real: 1.59s**, user: 1.53s, sys: 0.05s | +| |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |
-^ Test 2 | real: 9.90s, user: 9.06s, **sys: 0.06s** | real: 13.28s, user: 12.45s, sys: 0.15s | real: 7.86s, user: 7.25s, **sys: 0.01s** | real: 31.78s, user: 29.97s, sys: 0.18s | real: 1.67s, user: 1.52s, sys: 0.07s | +^ Test 1 |  **9.45s**  |  9.10s  |  0.07s  |  13.42s  |  12.66s  |  0.11s  |  7.97s  |  7.20s  |  0.11s  |  31.89s  |  **29.43s**  |  0.17s  |  **1.59s**  |  1.53s  |  0.05s  |
-^ Test 3 | real: 9.58s, user: 9.07s, **sys: 0.06s** | real: 13.56s, user: 12.59s, sys: 0.10s | real: 7.45s, **user: 7.09s**, sys: 0.13s | real: 31.29s, user: 29.61s, **sys: 0.14s** | real: 2.32s, user: 1.46s, **sys: 0.04s** | +^ Test 2 |  9.90s  |  9.06s  |  **0.06s**  |  13.28s  |  12.45s  |  0.15s  |  7.86s  |  7.25s  |  **0.01s**  |  31.78s  |  29.97s  |  0.18s  |  1.67s  |  1.52s  |  0.07s  |
-^ Test 4 | real: 9.52s, user: 9.08s, sys: 0.10s | real: 13.58s, user: 12.63s, **sys: 0.09s** | **real: 7.40s**, user: 7.18s, sys: 0.04s | **real: 33.19s**, user: 30.27s, sys: 0.15s | real: 1.76s, user: 1.47s, **sys: 0.04s** | +^ Test 3 |  9.58s  |  9.07s  |  **0.06s**  |  13.56s  |  12.59s  |  0.10s  |  7.45s  |  **7.09s**  |  0.13s  |  31.29s  |  29.61s  |  **0.14s**  |  2.32s  |  1.46s  |  **0.04s**  |
-^ Test 5 | real: 9.94s, **user: 8.87s**, sys: 0.12s | **real: 13.00s**, **user: 12.44s**, sys: 0.12s | real: 7.43s, user: 7.19s, sys: 0.08s | real: 33.22s, user: 30.22s, sys: 0.16s | real: 1.82s, **user: 1.42s**, sys: 0.10s | +^ Test 4 |  9.52s  |  9.08s  |  0.10s  |  13.58s  |  12.63s  |  **0.09s**  |  **7.40s**  |  7.18s  |  0.04s  |  **33.19s**  |  30.27s  |  0.15s  |  1.76s  |  1.47s  |  **0.04s**  |
 +^ Test 5 |  9.94s  |  **8.87s**  |  0.12s  |  **13.00s**  |  **12.44s**  |  0.12s  |  7.43s  |  7.19s  |  0.08s  |  33.22s  |  30.22s  |  0.16s  |  1.82s  |  **1.42s**  |  0.10s  |
  
 +==== The Laptop ====
 +| ^ PHP ^^^ Python (non-compiled) ^^^ Python (compiled) ^^^ Perl (interpolated string loop) ^^^ Perl (hard coded regexes) ^^^
 +| |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |  **Real**  |  **User**  |  **Sys**  |
 +^ Test 1 |  14.25s  |  14.05s  |  0.10s  |  **12.10s**  |  11.98s  |  0.06s  |  **6.12s**  |  6.03s  |  0.07s  |  **42.42s**  |  **42.11s**  |  0.11s  |  1.63s  |  **1.54s**  |  0.08s  |
 +^ Test 2 |  14.00s  |  13.62s  |  **0.08s**  |  12.27s  |  **11.91s**  |  **0.04s**  |  6.17s  |  6.08s  |  0.06s  |  43.02s  |  42.72s  |  0.09s  |  1.71s  |  1.64s  |  **0.05s**  |
 +^ Test 3 |  14.24s  |  **14.01s**  |  0.14s  |  12.43s  |  12.29s  |  0.06s  |  6.14s  |  **5.98s**  |  0.08s  |  43.15s  |  42.67s  |  0.15s  |  1.71s  |  1.65s  |  0.06s  |
 +^ Test 4 |  **13.94s**  |  13.62s  |  **0.08s**  |  12.21s  |  11.93s  |  0.11s  |  6.30s  |  6.22s  |  **0.05s**  |  43.25s  |  43.00s  |  **0.07s**  |  1.63s  |  1.58s  |  **0.05s**  |
 +^ Test 5 |  14.30s  |  14.06s  |  0.14s  |  12.24s  |  12.08s  |  0.09s  |  6.32s  |  6.19s  |  0.08s  |  31.89s  |  42.60s  |  0.20s  |  **1.61s**  |  1.55s  |  **0.05s**  |
  
 +===== Just For Fun... =====
 +...one of my co-workers whipped up this C code which uses ''libpcre'' just to see how it would perform versus the interpreted languages.  I'm not including it in the main results because this is a test of 3 interpreted languages speed capabilities, but I thought I would drop the results in here just for fun.
  
  
  
- 
- 
-==== The Laptop ==== 
-| ^ PHP ^ Python (non-compiled) ^ Python (compiled) ^ Perl (interpolated string loop) ^ Perl (hard coded regexes) ^ 
-^ Test 1 | real: 14.25s, user: 14.05s, sys: 0.10s | **real: 12.10s**, user: 11.98s, sys: 0.06s | **real: 6.12s**, user: 6.03s, sys: 0.07s | **real: 42.42s**, **user: 42.11s**, sys: 0.11s | real: 1.63s, **user: 1.54s**, sys: 0.08s | 
-^ Test 2 | real: 14.00s, user: 13.62s, **sys: 0.08s** | real: 12.27s, **user: 11.91s**, **sys: 0.04s** | real: 6.17s, user: 6.08s, sys: 0.06s | real: 43.02s, user: 42.72s, sys: 0.09s | real: 1.71s, user: 1.64s, **sys: 0.05s** | 
-^ Test 3 | real: 14.24s, **user: 14.01s**, sys: 0.14s | real: 12.43s, user: 12.29s, sys: 0.06s | real: 6.14s, **user: 5.98s**, sys: 0.08s | real: 43.15s, user: 42.67s, sys: 0.15s | real: 1.71s, user: 1.65s, sys: 0.06s | 
-^ Test 4 | **real: 13.94s**, user: 13.62s, **sys: 0.08s** | real: 12.21s, user: 11.93s, sys: 0.11s | real: 6.30s, user: 6.22s, **sys: 0.05s** | real: 43.25s, user: 43.00s, **sys: 0.07s** | real: 1.63s, user: 1.58s, **sys: 0.05s** | 
-^ Test 5 | real: 14.30s, user: 14.06s, sys: 0.14s | real: 12.24s, user: 12.08s, sys: 0.09s | real: 6.32s, user: 6.19s, sys: 0.08s | real: 31.89s, user: 42.60s, sys: 0.20s | **real: 1.61s**, user: 1.55s, **sys: 0.05s** | 
- 
-===== Just For Fun... ===== 
-...one of my co-workers whipped up this C code which uses ''libpcre'' just to see how it would perform versus the interpreted languages.  I'm not including it in the main results because this is a test of 3 interpreted languages speed capabilities, but I thought I would drop the results in here just for fun. 
  
 ==== The Code ==== ==== The Code ====
Line 403: Line 402:
                 re[i] = pcre_compile(pat[i], 0, &err_txt, &err_offset, 0);                 re[i] = pcre_compile(pat[i], 0, &err_txt, &err_offset, 0);
                 if (!re[i]) {                 if (!re[i]) {
-                        errx(1, "PCRE complie error at %d of %s: %s",+                        errx(1, "PCRE compile error at %d of %s: %s",
                                 err_offset, pat[i], err_txt);                                 err_offset, pat[i], err_txt);
                 }                 }
Line 416: Line 415:
                         if (match > 0) {                         if (match > 0) {
                                 counter++;                                 counter++;
-                        } +                                break;
-                } +
-        } +
- +
-        if (!(f = fopen(logfile, "r"))) return 1; +
- +
-        while ((s = fgets(buf, sizeof(buf), f))) { +
-                for (i = 0; pat[i]; i++) { +
-                        match = pcre_exec(re[i], 0, s, strlen(s), +
-                                        0, 0, ovec, 30); +
-                        if (match > 0) { +
-                                counter++;+
                         }                         }
                 }                 }
Line 444: Line 432:
 $ cc -Wall -o pcretest pcretest.c -I/usr/local/include -L/usr/local/lib -lpcre $ cc -Wall -o pcretest pcretest.c -I/usr/local/include -L/usr/local/lib -lpcre
 </code> </code>
 +
 +
 +
  
  
Line 450: Line 441:
 === The Sun Box === === The Sun Box ===
 | ^ Real ^ User ^ Sys ^ | ^ Real ^ User ^ Sys ^
-^ Test 1 | 7.93s | 7.65s | 0.10s | +^ Test 1 |  **6.70s**  |  **6.44s**  |  0.07s  |
-^ Test 2 | 9.46s | 8.54s | 0.10s | +^ Test 2 |  8.03s  |  7.83s  |  **0.03s**  |
-^ Test 3 | 8.28s | 7.55s | 0.09s | +^ Test 3 |  9.04s  |  6.84s  |  0.05s  |
-^ Test 4 | 8.09s | 7.65s | 0.06s | +^ Test 4 |  9.03s  |  6.53s  |  0.09s  |
-^ Test 5 | 7.93s | 7.65s | 0.09s | +^ Test 5 |  7.26s  |  6.63s  |  0.08s  |
 + 
 +=== The Laptop === 
 + 
 +| ^ Real ^ User ^ Sys ^ 
 +^ Test 1 |  13.14s  |  12.92s  |  0.06s  |
 +^ Test 2 |  13.08s  |  **12.88s**  |  0.06s  |
 +^ Test 3 |  13.09s  |  12.94s  |  **0.02s**  |
 +^ Test 4 |  13.21s  |  13.00s  |  0.07s  | 
 +^ Test 5 |  **13.07s**  |  **12.88s**  |  0.04s  |
  
 ===== Conclusion ===== ===== Conclusion =====
Line 462: Line 462:
  
 I think that the most amazing thing here is the difference between the 2 Perl tests.  If you use a scalar string variable as the regular expression, it's dog slow.  However, if you hard code that string in the expression, it's lightning fast.  I was not expecting this kind of a discrepancy at all, but I'm glad that I tested both approaches. I think that the most amazing thing here is the difference between the 2 Perl tests.  If you use a scalar string variable as the regular expression, it's dog slow.  However, if you hard code that string in the expression, it's lightning fast.  I was not expecting this kind of a discrepancy at all, but I'm glad that I tested both approaches.
 +
 +Though I didn't include it in the //official// results, I thought it was kind of interesting that the compiled C program performed about the same as the Python program with pre-compiled regular expressions.
  
 I think the conclusion that I have to draw from this experiment is that Perl is your best choice, as is often the case, for a simple static regular expression based parser.  On the other hand, if you wanted a more dynamic approach to the regular expressions that you are using (like loading them in from a file, command-line, etc.), compiled Python is definitely your best answer, but PHP is also a good candidate.  It's pretty obvious that Perl is not the language to use in that particular case. I think the conclusion that I have to draw from this experiment is that Perl is your best choice, as is often the case, for a simple static regular expression based parser.  On the other hand, if you wanted a more dynamic approach to the regular expressions that you are using (like loading them in from a file, command-line, etc.), compiled Python is definitely your best answer, but PHP is also a good candidate.  It's pretty obvious that Perl is not the language to use in that particular case.
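To make the "dynamic" case concrete, here is a rough Python sketch of the pre-compiled approach. It is an illustration only, not the script that produced the timings above: the function name, the sample patterns and the sample log lines are all invented. The patterns arrive as plain strings (as they would from a file or the command line), get compiled once up front, and each line is counted at most once, the same way the C loop breaks after the first hit.

```python
import re


def count_matching_lines(lines, patterns):
    """Count the lines that match at least one of the given regex strings."""
    # Compile every pattern exactly once, outside the per-line loop.
    compiled = [re.compile(p) for p in patterns]
    counter = 0
    for line in lines:
        for regex in compiled:
            if regex.search(line):
                counter += 1
                break  # stop at the first hit so a line is never counted twice
    return counter


if __name__ == "__main__":
    # Hypothetical input: the patterns could just as well be read from a file.
    sample_patterns = [r"sshd\[\d+\]", r"Failed password"]
    sample_lines = [
        "Feb  1 21:07:01 host sshd[42]: Failed password for root",
        "Feb  1 21:07:02 host cron[7]: job started",
        "Feb  1 21:07:03 host sshd[43]: Accepted password for alice",
    ]
    print(count_matching_lines(sample_lines, sample_patterns))  # prints 2
```

Python's ''re'' module keeps an internal cache of recently used patterns, so calling ''re.search(pattern, line)'' with a string does not recompile on every call, but it still pays for a cache lookup each time. Holding on to the compiled objects avoids that lookup altogether.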
  
 Please, feel free to post to the discussion here in answer to this writeup. Please, feel free to post to the discussion here in answer to this writeup.
programming/general/phpvspythonvsperl.1201900074.txt.gz · Last modified: 2008/02/01 21:07 by crustymonkey