While chatting with some of the well known moders in MakeWebGames forums and chat, I discovered that even if all have some knowledge of SQL (as it is required to be able to store and retrieve all the information in a standard web site setup), most fails when we go to something yet a bit more special, like, using sub-queries, join or grouping functions. This couple of articles will be toward this group of people which could gain something by doing yet more complex queries. I will explain the what and why of all those, and hopefully you will be all up and running to do yourself more complex queries and dig into the tables with some stronger tools.
Before we start digging into complex queries we will have to setup a couple of tables where which will help us to build our examples. I will also store those in a SQL file so you can directly download them and run them inside your PHPMyAdmin.
http://engine.nowhere-else.org/article4_tables.txt
This should create a table PLAYER with about 60 entries inside.
Now that we have the table and the data, we can start to check a few things. The simplest query that everyone should know is retrieve a row given an ID:
SELECT * FROM PLAYER WHERE ID=1;
If we read this query, we can decode it like that, "take all the data from the table PLAYER where the column ID contains the value 1";
Of course this was an easy one, now if we want instead to retrieve all the rows where the username starts with the letter B the query will use a keyword "LIKE" instead of the equal sign:
SELECT * FROM PLAYER WHERE USERNAME LIKE 'b%';
If you run this query you will see it returns more than one row, and that the uppercase or lowercase doesn't matter to the condition. Again this is not a very complex query, the only interesting thing here is the use of % which is a wild character when you use the "LIKE" keyword. The underscore "_" match any single character and the percent "%" match any number of any characters. You can also do a "NOT LIKE" operation to reverse your query. Keep in mind that "LIKE" is not a fast operation as MySQL needs to go inside the field an analyze it.
Now what if we want to count how many players reached at least level 20?
SELECT COUNT(*) FROM PLAYER WHERE LEVEL >= 20;
And here you see your first "GROUPING" function, as instead of returning rows MySQL will start to do some work with the result as it needs to count the number of rows.
We can work further in this direction and ask to count the number of players grouped by the first letter of their username:
SELECT LEFT(USERNAME,1), COUNT(*) FROM PLAYER GROUP BY LEFT(USERNAME,1);
As you see, we use a string function here, to read the first character out of the username (LEFT) then the now known COUNT(*) to count the number of rows and suddenly at the end we find a "GROUP BY" instruction. This is needed if you don't use ALL grouping functions in the SELECT statement. As LEFT is not a grouping function we need to give the instruction to MySQL to say how it must be grouped.
When you run this last query you will see that the column name of the result are basically generated out of the functions we use, however it may become odd to use such names in sub queries (something we will present here after), we need therefore to introduce column aliases:
SELECT LEFT(USERNAME,1) STARTWITH, COUNT(*) NB FROM PLAYER GROUP BY LEFT(USERNAME,1);
The names "STARTWITH" and "NB" will be used as columns name for the result table. Now, MySQL is not really strict on the syntax here, with some other databases you may need to use the keyword AS or put the alias name in double quotes.
Now what is a sub query? Well let's start with this example, we want to know all the username of the players which are in a group of those username where more than 4 players uses the same starting letter. Make sense? Well basically we want to retrieve all users width start the very common starting letter. How useful it is in real life I honestly don't know, but it gives you a first view of what a sub query can offer.
SELECT * FROM PLAYER
WHERE LEFT(USERNAME,1) IN
(SELECT STARTWITH FROM (SELECT LEFT(USERNAME,1) STARTWITH, COUNT(*) NB FROM PLAYER
GROUP BY LEFT(USERNAME,1)) SUMIT WHERE NB > 4);
Now you see it starts to get a bit more complex. What happens here is that we first need to extract which letters are mostly used, then select it in another query which are used by more than 4 players and finally retrieve all those which use those starting letters.
The syntax is the following: SELECT * FROM (SUBQUERY) LOGICALNAME
Where SUBQUERY is any kind of query and LOGICALNAME is a name you give to your pseudo table.
You see we used also the "IN" keyword, this tells to retrieve all rows where a column is within a given list. The list could be a coma separated list of values like (1,2,3,4) or defined by a sub query in which case the sub query MUST return only 1 column.
There is still something I want to explain before finishing with a nice query, and this is the JOIN feature. Normally when you want to "link" two tables together you do something like
SELECT A.*, B.* FROM A, B WHERE A.ID=B.A_ID;
Where the A_ID is the link between the two tables. This will work but will retrieve only the rows where there is a match. What if you want to retrieve also the rows of A, even if there is nothing in B? Then you must start to use the LEFT or RIGHT joins:
SELECT A.*, B.* FROM A LEFT JOIN B ON (A.ID=B.A_ID);
Now MySQL will retrieve all rows of A, with the data of B in case there is some.
With this knowledge we can now see the final mighty query of this tutorial:
SELECT PLAYER.USERNAME,S.NB NBTOT,A.NB NB10,B.NB NB30 FROM PLAYER,
(SELECT REFEREE REF, COUNT(*) NB FROM PLAYER WHERE REFEREE <> 0 GROUP BY REFEREE) AS S LEFT JOIN
(SELECT REFEREE REF, COUNT(*) NB FROM PLAYER WHERE REFEREE <> 0 AND LEVEL >= 10 GROUP BY REFEREE) AS A ON S.REF=A.REF LEFT JOIN
(SELECT REFEREE REF, COUNT(*) NB FROM PLAYER WHERE REFEREE <> 0 AND LEVEL >= 30 GROUP BY REFEREE) AS B ON A.REF = B.REF
WHERE S.REF=PLAYER.ID ORDER BY S.NB DESC
This query retrieve all players which have referred players, and count how many referred players they have, and the how many over level 10 and finally how many over level 30. This could be a nice report for you admin, and this is done with a single query.
The reason to do such complex queries is to avoid having multiple queries running to retrieve the data. Most of the time a single query even if a bit more complex is faster over a loop in PHP doing queries one after the other.
|