Efficient Learning for New OpenCart Developers

Posted by Paul on November 19, 2017

A programming language like PHP has a fairly large number of built-in functions (5000+), and in my experience (in teams and online) a lot of developers get by knowing just a small subset of the functions that are available to them - which is why so many articles like these exist: "7 little known but super useful PHP functions", "10 little known but useful PHP functions" and "10 less known but useful PHP functions".

Of course it would be great to have an encyclopaedic knowledge of PHP and other languages, but in practice, with Google always right there, many developers just Google as they go. But if you knew the most common functions off by heart, wouldn't it make reading the source code much quicker and easier? With so many of them to learn though, where do you start?

Well, if you're a developer looking to learn a new platform such as OpenCart, it occurred to me recently that an efficient way to learn might be to run some queries on the source code, see which programming keywords and functions are used most frequently, and spend some time familiarizing yourself with those: what they do, their parameters and so on.

If you've come across my posts on the OpenCart forum before, you may well have seen me finding answers by "grepping" the OpenCart source code. So can the Linux command line help us identify which are the most frequently used PHP functions in a codebase such as OpenCart so that we can memorise them? Let's find out ...

First we need to know if we can identify the frequency of words from the command line and a bit of Googling gives us this:

cat file.txt | tr 'A-Z' 'a-z' | sed 's/--/ /g' | sed 's/[^a-z ]//g' | tr -s '[[:space:]]' '\n' | sort | uniq -c | sort -n | tail -n100 | sort -nr

Always have a good look over Linux commands you find online before running them - the above looks good, it just outputs ("cats") a file and passes it through tr, sed, sort, uniq and a tail command so it's unlikely to be able to do any harm. Running it on a file seems to give us a count of the most frequently used words from the text of the file - a good start!
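To see what each stage contributes, here is a minimal sketch of the same pipeline run on a tiny made-up text. The sample sentence and the variable names are just assumptions for illustration, and the awk step at the end only strips the padding that uniq -c adds so the result is easier to read:

```shell
# Hypothetical sample text; any plain-text file would do.
sample=$(mktemp)
printf 'The dog. The cat -- the dog!\n' > "$sample"

# Same pipeline as above, keeping only the top 3 words:
#   tr    lower-cases everything
#   sed   turns "--" into a space, then drops everything except a-z and spaces
#   tr -s puts each word on its own line
#   sort | uniq -c counts identical lines, and the final sorts rank them
top_words=$(cat "$sample" \
  | tr 'A-Z' 'a-z' \
  | sed 's/--/ /g' \
  | sed 's/[^a-z ]//g' \
  | tr -s '[[:space:]]' '\n' \
  | sort | uniq -c | sort -n | tail -n3 | sort -nr \
  | awk '{print $1, $2}')

echo "$top_words"
# 3 the
# 2 dog
# 1 cat

rm -f "$sample"
```

Note that punctuation, digits and underscores are all thrown away by the second sed, which is fine for plain English but matters later when we start looking for function names.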

Next we want to run this on the entire OpenCart source code. Unfortunately though, this command will only take one file as input. Well, let's put all of the OpenCart PHP source into just one file, assuming we have a folder in the current directory called "upload" which contains the OpenCart source:

find upload/ -type f -name "*.php" -exec cat {} \; > oc.php

This command looks in the upload folder for files of type f (regular file) whose name matches *.php and runs the "cat" command on each one. The "{}" stands for the file that was found, so cat just outputs the content of that file. The backslash escapes the semi-colon which is needed to end the -exec command. The greater than symbol (>) redirects all of that output to a file we're going to call oc.php, overwriting it if it already exists (it's >> that appends).
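If you'd like to try the find command somewhere safe first, the sketch below builds a throwaway folder with two PHP files and one text file and runs the same command on it twice. The folder layout and file contents are made up for the example:

```shell
# Throwaway directory standing in for the OpenCart "upload" folder (hypothetical layout).
workdir=$(mktemp -d)
mkdir -p "$workdir/upload/admin"
printf '<?php echo "a";\n' > "$workdir/upload/index.php"
printf '<?php echo "b";\n' > "$workdir/upload/admin/index.php"
printf 'not php\n' > "$workdir/upload/readme.txt"

# Run the same find command twice. oc.php lives outside upload/,
# so find never reads the file it is writing to.
for run in 1 2; do
  ( cd "$workdir" && find upload/ -type f -name "*.php" -exec cat {} \; > oc.php )
done

# Only the two one-line .php files end up in oc.php, and running the command
# a second time does not double the content because > starts the file afresh.
php_lines=$(wc -l < "$workdir/oc.php" | tr -d ' ')
echo "$php_lines"
# 2

rm -rf "$workdir"
```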

The resulting file is all of the PHP from OpenCart in one file - 11MB of it which translates to roughly 11 million characters. That's a lot of keystrokes OpenCart founder Daniel Kerr has made over the years!

So, let's run our original command on this file and see what we get ...

cat oc.php | tr 'A-Z' 'a-z' | sed 's/--/ /g' | sed 's/[^a-z ]//g' | tr -s '[[:space:]]' '\n' | sort | uniq -c | sort -n | tail -n30 | sort -nr

Output:

  18736 if
  11975 function
   9692 array
   9562 return
   9075 public
   8877 the
   6793 else
   5411 true
   4454 new
   3963 to
   3844 url
   3626 as
   3419 a
   3406 dbprefix
   3253 value
   3164 and
   2963 is
   2848 of
   2812 null
   2764 false
   2732 thissessiondatausertoken
   2563 usertoken
   2530 this
   2525 not
   2515 class
   2365 from
   2349 foreach
   2343 use
   2293 param
   2247 php

Interesting, and to be expected, but not particularly useful at this point - the words that are appearing are very basic PHP keywords like if, else, function, class etc.

What we need is a way to filter this list on a known list of all PHP functions. Luckily the Linux command line is built for this sort of thing. The commands above make extensive use of the pipe character "|" and this deals with "streams" of text. So if we could just add an extra filter that filters out words not in a list of PHP functions then we'd end up with the info we're after. The grep command can do this for us:

grep -w -f php-functions.txt

Let's briefly mention the options: -w tells grep to match whole words only, so "count" won't match "counter". The -f option means grep will read the patterns to filter on from a file, one per line - usually it filters based on text you pass to it as an argument. So in our input file (php-functions.txt) each PHP function must be on its own line - luckily it was easy to find a list of PHP functions online and it was easy to turn it into this format.
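Here's a small sketch of that filter on its own, with a three-line stand-in for php-functions.txt and a hand-typed word list playing the part of the pipeline output (both are made up for the example):

```shell
# Hypothetical, tiny stand-in for php-functions.txt: one function name per line.
funcs=$(mktemp)
printf 'count\nexplode\nimplode\n' > "$funcs"

# One word per line, as produced by the tr step of the pipeline.
# -f reads the patterns from the file, -w only accepts whole-word matches,
# so "counter" is rejected even though it contains "count".
filtered=$(printf 'if\ncount\nfoo\nexplode\ncount\ncounter\n' | grep -w -f "$funcs")

echo "$filtered"
# count
# explode
# count

rm -f "$funcs"
```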

So we need to add this grep command in the right place in our original command - let's try it like this:

cat oc.php | tr 'A-Z' 'a-z' | sed 's/--/ /g' | sed 's/[^a-z ]//g' | tr -s '[[:space:]]' '\n' | grep -w -f php-functions.txt | sort | uniq -c | sort -n | tail -n100 | sort -nr

Output:

   9692 array
   1874 file
   1171 key
    754 sort
    482 implode
    428 count
    397 join
    392 link
    383 list
    374 date
    258 header
    207 log
    204 explode
    200 current
    183 time
    164 end
    161 attributes
    137 children
    135 delete
    128 echo
    123 mail
    120 max
    115 each
     95 empty
     77 dir
     70 next
     69 mktime
     69 defined
     66 copy
     63 min
     62 pos
     49 asin
     33 sprintf
     31 rand
     31 exit
     30 exp
     30 chr
     28 reset
     27 pow
     24 die
     22 range
     20 strtotime
     18 print
     16 unset
     16 trim
     16 setcookie
     16 getdate
     16 constant
     14 rewind
     13 sleep
     13 extract
      8 tmpfile
      8 gettype
      7 uniqid
      6 rename
      6 prev
      6 fopen
      6 filetype
      6 eval
      5 strpos
      5 realpath
      5 define
      4 xpath
      4 usleep
      4 strcmp
      4 stat
      4 scandir
      4 round
      4 pi
      4 ord
      4 isset
      4 glob
      3 tan
      3 serialize
      3 microtime
      3 htmlentities
      2 strtr
      2 mkdir
      2 floor
      2 dirname
      2 ceil
      1 unserialize
      1 unlink
      1 umask
      1 touch
      1 strtoupper
      1 strtolower
      1 strlen
      1 pclose
      1 pack
      1 fseek
      1 abs

This list looks good at first glance, but looking closely it seems that although there are indeed PHP functions with all of these names, many are being picked up when it's not actually the function being used. For example "asin", the arc sine? I'm not sure that would be a handy mathematical function in an ecommerce platform, and indeed a look at where this is found in the source code shows that there is a variable called $asin.
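A quick way to take that kind of look is a recursive grep. The sketch below uses a made-up sample file rather than the real OpenCart source, and the two patterns are just one way of telling a variable called $asin apart from a call to asin(); it assumes a grep that supports -r, as GNU and BSD grep do:

```shell
# Throwaway folder with one hypothetical PHP file.
workdir=$(mktemp -d)
mkdir -p "$workdir/upload"
printf '<?php\n$asin = $product["asin"];\n$angle = asin(0.5);\n' > "$workdir/upload/sample.php"

# Lines where "asin" appears as a variable name ($asin).
as_variable=$(grep -rn '\$asin' "$workdir/upload" | wc -l | tr -d ' ')

# Lines where asin is really called: the name is followed by "(" and is not
# preceded by "$", a letter, a digit or an underscore.
as_call=$(grep -rnE '(^|[^$[:alnum:]_])asin\(' "$workdir/upload" | wc -l | tr -d ' ')

echo "variable: $as_variable, call: $as_call"
# variable: 1, call: 1

rm -rf "$workdir"
```

On the real source you would point grep at upload/ instead of the temporary folder and drop the wc step to see the matching lines themselves.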

So let's adjust our search to look for words that are followed by the left parenthesis character, "(" by adding another grep filter and adjusting our command so that the left parenthesis is left in:

cat oc.php | tr 'A-Z' 'a-z' | sed 's/--/ /g' | sed 's/[^a-z (]//g' | tr -s '[[:space:]]' '\n' | grep "^[^(]*($" | sort | uniq -c | grep -w -f php-functions.txt | sort -n | tail -n100 | sort -nr

We'll also adjust our php-functions.txt file so that every function name has the left parenthesis at the end, for example "count(" instead of "count".
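Adding the parenthesis to every line doesn't need to be done by hand. A minimal sketch with sed, again using a three-line stand-in for the real list:

```shell
# Hypothetical stand-in for php-functions.txt.
funcs=$(mktemp)
printf 'array\ncount\nexplode\n' > "$funcs"

# "$" matches the end of each line, so this appends a literal "(" to every name.
funcs_paren=$(sed 's/$/(/' "$funcs")

echo "$funcs_paren"
# array(
# count(
# explode(

rm -f "$funcs"
```

Redirect the sed output to a new file (rather than back onto the file being read) if you want to keep the result.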

Here's the output:

   8020 array(
    207 count(
    197 explode(
    161 implode(
     69 mktime(
     58 delete(
     48 time(
     36 date(
     32 sprintf(
     30 list(
     27 rand(
     25 pow(
     24 die(
     21 exit(
     20 strtotime(
     16 getdate(
     12 chr(
     10 max(
      9 sleep(
      8 min(
      7 uniqid(
      7 gettype(
      6 file(
      5 mail(
      5 join(
      4 reset(
      4 log(
      4 isset(
      3 usleep(
      3 next(
      3 microtime(
      3 key(
      3 echo(
      2 unset(
      2 strtr(
      2 serialize(
      2 fopen(
      2 end(
      2 current(
      2 copy(
      1 umask(
      1 strpos(
      1 strlen(
      1 setcookie(
      1 round(
      1 rewind(
      1 realpath(
      1 range(
      1 pclose(
      1 fseek(
      1 empty(
      1 constant(

That looks a lot better and the first thing that strikes me is that although our command will list the top 100 functions, only 52 are listed. If this is correct (and it may not be, I may have missed something obvious!) it means that OpenCart only uses 52 of the many PHP functions and it should be fairly easy for most developers to commit all of these to memory including their parameters, return types etc. and that would make working with OpenCart even easier.
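Two things in the command above could be part of the "something obvious". The sed expression throws away underscores and digits, so str_replace( turns into strreplace( and can never match an entry in the list. And a call written as count($items) becomes count(items, which no longer ends with "(" and so is dropped by the new grep. As a sketch of one alternative (not run against OpenCart here, so no claims about what it would report), grep -o can pull out each name together with its parenthesis without stripping anything first. The sample PHP and the four-line function list are made up for the example, and -x -F are used instead of -w to match whole lines as fixed strings:

```shell
# Hypothetical PHP sample and function list.
sample=$(mktemp)
funcs=$(mktemp)
printf '<?php\n$n = count($items);\n$s = str_replace("a", "b", $s);\nif (in_array($x, $list)) { echo strlen($s); }\n$m = count( $other );\n$this->load->model("x");\n' > "$sample"
printf 'count(\nstr_replace(\nin_array(\nstrlen(\n' > "$funcs"

# -o prints only the matched text: an identifier immediately followed by "(".
# The list filter then drops anything that is not a known function, such as model(.
calls=$(grep -oE '[A-Za-z_][A-Za-z0-9_]*\(' "$sample" \
  | tr 'A-Z' 'a-z' \
  | grep -x -F -f "$funcs" \
  | sort | uniq -c | sort -nr \
  | awk '{print $1, $2}')

echo "$calls"

rm -f "$sample" "$funcs"
```

In this sample count( comes out on top with 2, followed by str_replace(, in_array( and strlen( with 1 each. A method call that happens to share its name with a PHP function, like ->delete(, would still be counted, so the numbers remain a rough guide rather than an exact tally.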

The Top 5

1. array

At number 1 we have "array". No surprises here really, in PHP and many languages, arrays are used a lot so it's a good idea to get familiar with exactly how arrays work as well as all the various other array functions.

2. count

Simply counts the number of items in an array - very useful when looping over an array.

3. explode

This function is actually included in the Antropy Web Developer test that we give to potential candidates. It simply takes a string and splits it into an array based on a delimiter like a comma.

4. implode

The opposite of explode - give it a separator and an array and it will join the array's items into a single string, with the separator between them.

5. mktime

Gives you the incredibly useful Unix timestamp for a date based on the arguments given, used like this:
mktime( $hour, $minute, $second, $month, $day, $year );
(Older code sometimes passes a seventh $is_dst argument, but that parameter was removed in PHP 7.0.)

Other Interesting Commonly Used Functions

pow

This function raises a number to a power, for example if you wanted to do two cubed (which is 8), you'd do:
pow( 2, 3 );

sleep

This seemed like an odd function to find in OpenCart because it pauses execution of the code for x number of seconds - why would you want that? A quick grep tells me this bit of code is actually only in OpenBayPro and the 3rd Party Guzzle library, so that makes sense.

Conclusion

I think the logic is sound that if you're looking to spend some time in the PHP docs familiarising yourself with various functions of the language, it makes sense to focus your time on the most commonly used ones as there are a lot of pretty esoteric and rarely used ones. I think it also makes sense to focus on the ones that are commonly used in PHP software that you're going to be working with, in this case OpenCart.

It's possible that the commands above have missed something and it's certainly the case that there are other ways of getting this same info - there are probably even very fancy IDEs that show it. Hopefully though the above is interesting and useful and will help those new to PHP and OpenCart use their time more efficiently.

What did you think? Do you know a better way to get this info? Let us know in the comments!
