Thursday, July 5, 2012

Interlude 6 - Lua Tables (Updated 23/01/16)

Interlude 6.0 Acknowledgements

Many thanks to the good folks over at the Codea Forum for their comments and advice regarding Lua tables. I have updated this Interlude based on their input. In particular, thank you to Andrew_Stacey,gunnar_z and emsi.

Interlude 6.1 The Table Data Structure

The table in Lua is a fascinating object, not least because it is the only data structure in Lua. It is very flexible and very powerful and if you are used to array, dictionary and matrix data structures in other languages, it can be tricky to get your head around initially. We need tables for the next tutorial so it seems a good time to take a quick detour to understand how they can be implemented.

A table is a collection of key-value pairs. Keys and values can be any type other than nil. A key is often called an index to a table. Like global variables, table fields evaluate to nil if they are not initialized. Also like global variables, you can assign nil to a table field to delete it. Because tables can contain data and functions they can also be used to implement an object.

Tables have no fixed size in Lua and you can dynamically add as many elements to a table as you like (memory permitting).

Something that wasn't clear to us initially was that tables can be thought of consisting of two parts: the array part and the hash part. If you use integer keys in the range 1 to n, then these entries will be stored in the array part. All other elements get stuck in the hash part. While you don't need to understand this to use tables, it does explain some of the practices outlined below. For those who would like a bit more detail, emsi pointed out an article on optimising Lua code which has an interesting section on tables.

Interlude 6.2 Creating Tables - the array part

You create a table in Lua using the construction expression. The simplest version of this is:

mTable = {}        -- At this stage both the array and hash parts of the table are empty.

If you just want a one dimensional array data structure use integers as the key and start at 1.

for i = 1, 100 do
    mTable[i] = i

This data will be placed in the array part of mTable, which as we will demonstrate below brings with it some useful benefits.

As Andrew_Stacey rightly pointed out, when you create a table with no keys, as in

mTable = {"a","b"}

then the keys start at 1 and increment so this is the same as

mTable = {1 = "a", 2 = "b"}

It is traditional in Lua to commence table indexes at 1, rather than 0 as found in the c based languages. See Interlude 6.4 for why you should follow this convention.

Interlude 6.3 Creating Tables - the hash part

The hash part of the table is used if your key is anything but a non zero integer. For example:

mTable = {["One"] = "Number 1"}
mTable["Two"] = "Number 2"
mTable.Three = "Number 3"

You can mix index types in the one table, but you will then start using the hash part of the table. For example:

mTable["x"] = 99 is perfectly valid addition to the array example in Interlude 6.2. Note that mTable[1] and mTable["1"] are not the same but equally valid members of mTable. When you are starting out, this can cause subtle bugs if you aren't careful.

Lua also allows dot notation which is an alternate syntactic structure that some people prefer (and some hate!). 

mTable.x = 99 is equivalent to mTable["x"] = 99.

gunnar_z provides theses additional examples:

"Table keys that are valid identifiers may always be accessed using dot notation, regardless of how they were initialized. So, if you do any of these:

a = {} 
a.x = 1

a = {} 
a['x'] = 1

idx = 'x' 
a = {} 
a[idx] = 1 

a = { ['x'] = 1 } 

idx = 'x' 
a = { [idx] = 1 } a = { x = 1 }

a.x always equals a['x'], both are 1 in this case."

Interlude 6.4 Iterating over Your Table - ipairs and pairs

Lua includes a handy function called ipairs to iterate over the array part of your table. The ipairs() iterator iterates, in numeric order, all elements with positive integer keys, from 1 until the first nonexistant or nil-valued key is encountered. Consequently, ipairs will not return an element with a key of 0. This is a good reason to start your arrays at an index of 1. You use ipairs in the following fashion:

for key, value in ipairs(mTable) do
    print("Array index: " .. key .. "and value: " ..value)

This statement will loop in order through each key-value pair in mTable, assigning the key and value to the loop variables. This structure is often called an ipairs loop. If you aren't going to use one of the loop variables it is traditional to name it with an underscore (_). You will see this in a number of the example projects included with Codea.

If you mix in non-integer keys (e.g. strings) in your table, then ipairs will simply ignore these. 
What about if you want to iterate over your whole table, irrespective of whether elements are in the array or hash part? In this case you need to use pairs(). This iterator is guaranteed to iterate over every key of every kind as long as it has a non-nil value. However, it doesn't iterate in any particular order.

Andrew_Stacey and gunnar_z suggest that "when using tables as arrays, with sequential numeric indices, you should be using ipairs instead of pairs. This will ensure that you only get the key-value pairs with numeric keys, and also they will be ordered. pairs will return all key-value pairs, and an order on the keys, even the numeric ones, is not guaranteed."

Interlude 6.5 Array Length

Another approach to iterating through the array portion of your table is to use the # (length) operator.

mArray = {}

print (#mArray)

Will return 0.

mArray = {"a"}

print (#mArray)

Will return 1.

mArray = {"a", "b"}

print (#mArray)

Will return 2 - and you are probably starting to get the idea. Note that arrays with gaps don't work with the length operator as it uses nil to detect the end of the array. For tables without gaps, you can iterate through them as follows:

for i = 1, #mArray do

gunnar_z's description of the length operator was so good that we thought we would reproduce it here:

"Basically, what # returns is not the length of the table or the index of the last element, but the numeric index of an element such that, for a given table T, T[#T] ~= nil and T[#T+1] == nil. With the sole exception of a table with no numeric keys, in which case the above condition does not hold for the returned value of 0. But, for a table
T = { [1] = 1, [3] = 2, [5] = 3 }
the value of #T may be any of 1, 3 or 5."

Interlude 6.6 Enter the Matrix - Two (and more) Dimensional Arrays

It is often handy to use a two dimensional array or Matrix to represent a grid for your game. In fact we will be using this construct in the next tutorial. The easiest way to do this is to have an array of an array (or a table of a table to be strictly correct). That is, each element of your one dimensional array is another array.

-- create a 2D matrix with N rows and M columns, 
-- each element initialised to 0.

mMatrix = {}
    for i = 1, N do 
        mMatrix[i] = {}                             -- create a new row 
        for j = 1, M do 
            mMatrix[i][j] = 0 

To create additional dimensions you just nest additional arrays.

Interlude 6.7 Dictionary like Structures  - back to hash

You can use the table constructor {} to assign different types of keys to values. For example, say we wanted a data structure which associates Capital Cities with States, we could do something like:

CapitalOfState = {["NSW"] = "Sydney",
["VIC"] = "Melbourne",
["QLD"] = "Brisbane",
["SA"] = "Adelaide",
["WA"] = "Perth",
["TAS"] = "Hobart"}

Note that ACT and NT are not States but Territories! We mentioned key-value pairs earlier. In this example we are associating the key "NSW" with the value "Sydney". To retrieve a value for a particular key you would use CapitalOfState["NSW"] which will return "Sydney".

If your keys are valid identifiers (i.e. one word using letters and no symbols) you can construct your table without using the [] and "". It would then look like:

CapitalOfState = {NSW = "Sydney",
VIC = "Melbourne",
QLD = "Brisbane",
SA = "Adelaide",
WA = "Perth",
TAS = "Hobart"}

As noted earlier, table keys that are valid identifiers may be accessed using dot notation  instead of square brackets and quotes:

CapitalOfState.NSW will return "Sydney".

This concludes our detour down Lua tables, next we will look at how we use these data structures in anger.