Zend Hash API
The Zend Hash API is split into a few basic categories and, with a couple of exceptions, the functions in these categories generally return either SUCCESS or FAILURE.
Creation
Every HashTable is initialized by a common constructor:
int zend_hash_init(HashTable *ht, uint nSize, hash_func_t pHashFunction, dtor_func_t pDestructor, zend_bool persistent)
ht is a pointer to a HashTable variable either declared as an immediate value, or dynamically allocated via emalloc(), pemalloc(), or more commonly ALLOC_HASHTABLE(ht). The ALLOC_HASHTABLE() macro uses pre-sized blocks of memory from a special pool to speed the allocation time required and is generally preferred over ht = emalloc(sizeof(HashTable));.
nSize should be set to the maximum number of elements that the HashTable is expected to hold. If more than this number of elements are added to the HashTable, it will be able to grow, but only at a noticeable cost in processing time as Zend reindexes the entire table for the newly widened structure. If nSize is not a power of 2, it will be automatically enlarged to the next higher power according to the formula
nSize = pow(2, ceil(log(nSize, 2)));
pHashFunction is a holdover from an earlier version of the Zend Engine and is no longer used, so this value should always be set to NULL. In earlier versions of the Zend Engine, this value could be pointed to an alternate hashing algorithm to be used in place of the standard DJBX33A method, a quick, moderately collision-resistant hashing algorithm for converting arbitrary string keys into reproducible integer values.
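To make the DJBX33A idea concrete, here is a minimal standalone sketch of the "times 33, plus the next byte" recurrence. The name sample_djbx33a() is hypothetical, and this is not the engine's own implementation; it only assumes the conventional starting value of 5381 for this family of hashes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the Zend API: a plain DJBX33A loop.
 * Starts from the conventional seed 5381 and, for every byte of the key,
 * multiplies the running hash by 33 and adds that byte. */
unsigned long sample_djbx33a(const char *key, size_t len)
{
    unsigned long hash = 5381UL;
    size_t i;

    for (i = 0; i < len; i++) {
        hash = hash * 33UL + (unsigned char)key[i];
    }
    return hash;
}
```

The same key and length always yield the same integer, which is the "reproducible" property the text relies on.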
pDestructor is a pointer to a method to be called whenever an element is removed from a HashTable such as when using zend_hash_del() or replacing an item with zend_hash_update(). The prototype for any destructor method must be
void method_name(void *pElement);
where pElement is a pointer to the item being removed from the HashTable.
The final option, persistent, is a simple flag that the engine passes on to the pemalloc() function calls you were introduced to in Chapter 3, "Memory Management." Any HashTables that need to remain available between requests must have this flag set and must have been allocated using pemalloc().
This method can be seen in use at the start of every PHP request cycle as the EG(symbol_table) global is initialized:
zend_hash_init(&EG(symbol_table), 50, NULL, ZVAL_PTR_DTOR, 0);
As you can see here, when an item is removed from the symbol table (possibly in response to an unset($foo); statement), a pointer to the zval* stored in the HashTable (effectively a zval**) is sent to zval_ptr_dtor(), which is what the ZVAL_PTR_DTOR macro expands out to.
Because 50 is not an exact power of 2, the size of the initial global symbol table will actually be 64, the next higher power of 2.
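The rounding of nSize can be tried outside the engine with a few lines of C. This is only a sketch under stated assumptions: sample_round_up_pow2() is a hypothetical name, and the engine's own routine may apply additional limits (such as a minimum table size) that are ignored here.

```c
/* Hypothetical helper, not part of the Zend API: round n up to the
 * nearest power of 2 by doubling a candidate size until it is large enough. */
unsigned int sample_round_up_pow2(unsigned int n)
{
    unsigned int size = 1;

    /* Guard against overflow: nothing larger than 2^31 fits in 32 bits */
    if (n > 0x80000000U) {
        return 0x80000000U;
    }
    while (size < n) {
        size <<= 1;
    }
    return size;
}
```

Called with 50, the loop passes through 1, 2, 4, 8, 16, and 32 before stopping at 64, matching the symbol table example above.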
Population
There are four primary functions for inserting or updating data in HashTables:
int zend_hash_add(HashTable *ht, char *arKey, uint nKeyLen, void *pData, uint nDataSize, void **pDest);
int zend_hash_update(HashTable *ht, char *arKey, uint nKeyLen, void *pData, uint nDataSize, void **pDest);
int zend_hash_index_update(HashTable *ht, ulong h, void *pData, uint nDataSize, void **pDest);
int zend_hash_next_index_insert(HashTable *ht, void *pData, uint nDataSize, void **pDest);
The first two functions here are for adding associatively indexed data to a HashTable such as with the statement $foo['bar'] = 'baz'; which in C would look something like:
zend_hash_add(fooHashTbl, "bar", sizeof("bar"), &barZval, sizeof(zval*), NULL);
The only difference between zend_hash_add() and zend_hash_update() is that zend_hash_add() will fail if the key already exists.
The next two functions deal with numerically indexed HashTables in a similar manner. This time, the distinction between the two lies in whether a specific index is provided, or if the next available index is assigned automatically.
If it's necessary to store the index value of the element being inserted using zend_hash_next_index_insert(), then the zend_hash_next_free_element() function may be used:
ulong nextid = zend_hash_next_free_element(ht);
zend_hash_index_update(ht, nextid, &data, sizeof(data), NULL);
In the case of each of these insertion and update functions, if a value is passed for pDest, the void* data element that pDest points to will be populated by a pointer to the copied data value. This parameter has the same usage (and result) as the pData parameter passed to the zend_hash_find() function you're about to look at.
Recall
Because there are two distinct organizations to HashTable indices, there must be two methods for extracting them:
int zend_hash_find(HashTable *ht, char *arKey, uint nKeyLength, void **pData);
int zend_hash_index_find(HashTable *ht, ulong h, void **pData);
As you can guess, the first is for associatively indexed arrays while the second is for numerically indexed ones. Recall from Chapter 2 that when data is added to a HashTable, a new memory block is allocated for it and the data passed in is copied; when the data is extracted back out it is the pointer to that data which is returned. The following code fragment adds data1 to the HashTable, and then extracts it back out such that at the end of the routine, *data2 contains the same contents as *data1 even though the pointers refer to different memory addresses.
void hash_sample(HashTable *ht, sample_data *data1)
{
    sample_data *data2;
    ulong targetID = zend_hash_next_free_element(ht);

    if (zend_hash_index_update(ht, targetID,
            data1, sizeof(sample_data), NULL) == FAILURE) {
        /* Should never happen */
        return;
    }
    if (zend_hash_index_find(ht, targetID, (void **)&data2) == FAILURE) {
        /* Very unlikely since we just added this element */
        return;
    }
    /* data1 != data2, however *data1 == *data2 */
}
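The copy-on-insert behavior described above (and the role of pDest) can be modeled without the engine. The following is a toy sketch, not Zend code: toy_store, toy_store_put(), and toy_store_get() are hypothetical names for a one-slot container that copies whatever is handed to it and later returns a pointer to its own copy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical payload type standing in for sample_data */
typedef struct { int id; char name[16]; } sample_data;

/* Hypothetical one-element stand-in for a HashTable */
typedef struct { void *slot; } toy_store;

/* Copies size bytes of data into freshly allocated memory owned by the
 * store. If dest is non-NULL it receives a pointer to that copy, much as
 * pDest does. Returns 0 on success, -1 on allocation failure. */
int toy_store_put(toy_store *st, const void *data, size_t size, void **dest)
{
    void *copy = malloc(size);

    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, data, size);
    free(st->slot);              /* discard any previous element */
    st->slot = copy;
    if (dest != NULL) {
        *dest = copy;
    }
    return 0;
}

/* Hands back a pointer to the stored copy, never to the caller's original */
int toy_store_get(const toy_store *st, void **out)
{
    if (st->slot == NULL) {
        return -1;
    }
    *out = st->slot;
    return 0;
}
```

After a put followed by a get, the returned pointer differs from the address of the original variable while the bytes it points to are identical, which is the same relationship as data1 and data2 in hash_sample().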
Often, retrieving the stored data is not as important as knowing that it exists; for this purpose two more functions exist:
int zend_hash_exists(HashTable *ht, char *arKey, uint nKeyLen); int zend_hash_index_exists(HashTable *ht, ulong h);
These two methods do not return SUCCESS/FAILURE; rather they return 1 to indicate that the requested key/index exists or 0 to indicate absence. The following code fragment performs roughly the equivalent of isset($foo):
if (zend_hash_exists(EG(active_symbol_table), "foo", sizeof("foo"))) {
    /* $foo is set */
} else {
    /* $foo does not exist */
}
Quick Population and Recall
ulong zend_get_hash_value(char *arKey, uint nKeyLen);
When performing multiple operations with the same associative key, it can be useful to precompute the hash using zend_get_hash_value(). The result can then be passed to a collection of "quick" functions that behave exactly like their non-quick counterparts, but use the precomputed hash value rather than recalculating it each time.
int zend_hash_quick_add(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void *pData, uint nDataSize, void **pDest);
int zend_hash_quick_update(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void *pData, uint nDataSize, void **pDest);
int zend_hash_quick_find(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void **pData);
int zend_hash_quick_exists(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval);
Surprisingly there is no zend_hash_quick_del(). The "quick" hash functions might be used in something like the following code fragment, which copies a specific element from hta to htb, which are zval* HashTables:
void php_sample_hash_copy(HashTable *hta, HashTable *htb,
                          char *arKey, uint nKeyLen TSRMLS_DC)
{
    ulong hashval = zend_get_hash_value(arKey, nKeyLen);
    zval **copyval;

    if (zend_hash_quick_find(hta, arKey, nKeyLen,
            hashval, (void**)&copyval) == FAILURE) {
        /* arKey doesn't actually exist */
        return;
    }
    /* The zval* is about to be owned by another hash table */
    (*copyval)->refcount++;
    zend_hash_quick_update(htb, arKey, nKeyLen, hashval,
                           copyval, sizeof(zval*), NULL);
}
Copying and Merging
The previous task, duplicating an element from one HashTable to another, is extremely common and is often done en masse. To avoid the headache and trouble of repeated recall and population cycles, there exist three helper methods:
typedef void (*copy_ctor_func_t)(void *pElement);
void zend_hash_copy(HashTable *target, HashTable *source, copy_ctor_func_t pCopyConstructor, void *tmp, uint size);
Every element in source will be copied to target and then processed through the pCopyConstructor function. For HashTables such as userspace variable arrays, this provides the opportunity to increment the reference count so that when the zval* is removed from one or the other HashTable, it's not prematurely destroyed. If the same element already exists in the target HashTable, it is overwritten by the new element. Other existing elements (those not being overwritten) are not implicitly removed.
tmp should be a pointer to an area of scratch memory to be used by the zend_hash_copy() function while it's executing. Ever since PHP 4.0.3, however, this temporary space is no longer used. If you know your extension will never be compiled against a version older than 4.0.3, just leave this NULL.
size is the number of bytes occupied by each member element. In the case of a userspace variable hash, this would be sizeof(zval*).
void zend_hash_merge(HashTable *target, HashTable *source, copy_ctor_func_t pCopyConstructor, void *tmp, uint size, int overwrite);
zend_hash_merge() differs from zend_hash_copy() only in the addition of the overwrite parameter. When set to a non-zero value, zend_hash_merge() behaves exactly like zend_hash_copy(); when set to zero, it skips any already existing elements.
typedef zend_bool (*merge_checker_func_t)(HashTable *target_ht, void *source_data, zend_hash_key *hash_key, void *pParam);
void zend_hash_merge_ex(HashTable *target, HashTable *source, copy_ctor_func_t pCopyConstructor, uint size, merge_checker_func_t pMergeSource, void *pParam);
The final form of this group of functions allows for selective copying using a merge checker function. The following example shows zend_hash_merge_ex() in use to copy only the associatively indexed members of the source HashTable (which happens to be a userspace variable array):
zend_bool associative_only(HashTable *ht, void *pData,
                           zend_hash_key *hash_key, void *pParam)
{
    /* True if there's a key, false if there's not */
    return (hash_key->arKey && hash_key->nKeyLength);
}

void merge_associative(HashTable *target, HashTable *source)
{
    zend_hash_merge_ex(target, source, zval_add_ref,
                       sizeof(zval*), associative_only, NULL);
}
Iteration by Hash Apply
Like in userspace, there's more than one way to iterate an array. The first, and generally easiest, method is using a callback system similar in function to the foreach() construct in userspace. This two-part system involves a callback function you'll write, which acts like the code nested in a foreach loop, and a call to one of the three hash application API functions.
typedef int (*apply_func_t)(void *pDest TSRMLS_DC);
void zend_hash_apply(HashTable *ht, apply_func_t apply_func TSRMLS_DC);
This simplest form of the hash apply family simply iterates through ht, calling apply_func for each element with a pointer to the current element passed in pDest.
typedef int (*apply_func_arg_t)(void *pDest, void *argument TSRMLS_DC);
void zend_hash_apply_with_argument(HashTable *ht, apply_func_arg_t apply_func, void *data TSRMLS_DC);
In this next hash apply form, an arbitrary argument is passed along with the hash element. This is useful for multipurpose hash apply functions where behavior can be customized depending on an additional parameter.
Each callback function, no matter which iterator function it applies to, expects one of the three possible return values shown in Table 8.1.
Constant | Meaning |
---|---|
ZEND_HASH_APPLY_KEEP | Returning this value completes the current loop and continues with the next value in the subject hash table. This is equivalent to issuing continue; within a foreach() control block. |
ZEND_HASH_APPLY_STOP | This return value halts iteration through the subject hash table and is the same as issuing break; within a foreach() loop. |
ZEND_HASH_APPLY_REMOVE | Similar to ZEND_HASH_APPLY_KEEP, this return value will jump to the next iteration of the hash apply loop. However, this return value will also delete the current element from the subject hash. |
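The effect of the three return values in Table 8.1 can be sketched on a plain C array. Everything here is a hypothetical stand-in rather than engine code: the TOY_APPLY_* constants, toy_apply(), and toy_drop_negatives() only mirror the keep, stop, and remove behaviors described in the table.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the three verdicts in Table 8.1 */
enum { TOY_APPLY_KEEP, TOY_APPLY_REMOVE, TOY_APPLY_STOP };

typedef int (*toy_apply_func_t)(int *element);

/* Walks items[0..count), calling func on each element until it asks to
 * stop. Removed elements are squeezed out in place; the new element
 * count is returned. */
size_t toy_apply(int *items, size_t count, toy_apply_func_t func)
{
    size_t read, write = 0;
    int stopped = 0;

    for (read = 0; read < count; read++) {
        int verdict = TOY_APPLY_KEEP;

        if (!stopped) {
            verdict = func(&items[read]);
            if (verdict == TOY_APPLY_STOP) {
                stopped = 1;     /* like break; later elements are untouched */
            }
        }
        if (verdict != TOY_APPLY_REMOVE) {
            items[write++] = items[read];
        }
    }
    return write;
}

/* Sample callback: stop at the first zero, remove negatives, keep the rest */
int toy_drop_negatives(int *element)
{
    if (*element == 0) {
        return TOY_APPLY_STOP;
    }
    return (*element < 0) ? TOY_APPLY_REMOVE : TOY_APPLY_KEEP;
}
```

Applied to the array {3, -1, 4, 0, -5}, the callback removes -1, stops at 0, and never sees -5, leaving {3, 4, 0, -5}.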
A simple foreach() loop in userspace such as the following:
<?php
foreach ($arr as $val) {
    echo "The value is: $val\n";
}
?>
would translate into the following callback in C:
int php_sample_print_zval(zval **val TSRMLS_DC)
{
    /* Duplicate the zval so that
     * the original's contents are not destroyed */
    zval tmpcopy = **val;

    zval_copy_ctor(&tmpcopy);
    /* Reset refcount & Convert */
    INIT_PZVAL(&tmpcopy);
    convert_to_string(&tmpcopy);
    /* Output */
    php_printf("The value is: ");
    PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
    php_printf("\n");
    /* Toss out old copy */
    zval_dtor(&tmpcopy);
    /* continue; */
    return ZEND_HASH_APPLY_KEEP;
}
which would be iterated using
zend_hash_apply(arrht, php_sample_print_zval TSRMLS_CC);
Note
Recall that when variables are stored in a hash table, only a pointer to the zval is actually copied; the contents of the zval are never touched by the HashTable itself. Your iterator callback prepares for this by declaring itself to accept a zval** even though the function type only calls for a single level of indirection. Refer to Chapter 2 for more information on why this is done.
typedef int (*apply_func_args_t)(void *pDest, int num_args, va_list args, zend_hash_key *hash_key);
void zend_hash_apply_with_arguments(HashTable *ht, apply_func_args_t apply_func, int numargs, ...);
In order to receive the key during loops as well as the value, the third form of zend_hash_apply() must be used. For example, if you extended this exercise to support outputting the key:
<?php
foreach ($arr as $key => $val) {
    echo "The value of $key is: $val\n";
}
?>
then your current iterator callback would have nowhere to get $key from. By switching to zend_hash_apply_with_arguments(), however, your callback prototype and implementation now become
int php_sample_print_zval_and_key(zval **val, int num_args,
                                  va_list args, zend_hash_key *hash_key)
{
    /* Duplicate the zval so that
     * the original's contents are not destroyed */
    zval tmpcopy = **val;
    /* tsrm_ls is needed by output functions */
    TSRMLS_FETCH();

    zval_copy_ctor(&tmpcopy);
    /* Reset refcount & Convert */
    INIT_PZVAL(&tmpcopy);
    convert_to_string(&tmpcopy);
    /* Output */
    php_printf("The value of ");
    if (hash_key->nKeyLength) {
        /* String Key / Associative */
        PHPWRITE(hash_key->arKey, hash_key->nKeyLength);
    } else {
        /* Numeric Key */
        php_printf("%ld", hash_key->h);
    }
    php_printf(" is: ");
    PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
    php_printf("\n");
    /* Toss out old copy */
    zval_dtor(&tmpcopy);
    /* continue; */
    return ZEND_HASH_APPLY_KEEP;
}
Which can then be called as:
zend_hash_apply_with_arguments(arrht, php_sample_print_zval_and_key, 0);
Note
This particular example required no arguments to be passed; for information on extracting variable argument lists from va_list args, see the POSIX documentation pages for va_start(), va_arg(), and va_end().
Notice that nKeyLength, rather than arKey, was used to test whether the key was associative. This is because implementation specifics in Zend HashTables can sometimes leave data in the arKey variable. nKeyLength, however, can be safely used even for empty keys (for example, $foo[''] = "Bar";) because the trailing NUL byte is included, giving the key a length of 1.
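For callbacks that do receive extra arguments, the va_list handling mentioned in the note above follows the standard C pattern. The sketch below is standalone and uses hypothetical names (toy_sum_callback() and toy_call_with_arguments()); it only mirrors the shape of a callback that is handed num_args and a va_list, and assumes every extra argument is an int.

```c
#include <stdarg.h>

/* Hypothetical callback: receives the element plus a va_list that holds
 * num_args additional int arguments, and returns their combined sum. */
int toy_sum_callback(int *element, int num_args, va_list args)
{
    int total = *element;
    int i;

    for (i = 0; i < num_args; i++) {
        total += va_arg(args, int);   /* pull the next int off the list */
    }
    return total;
}

/* Hypothetical driver: collects its variable arguments with va_start(),
 * forwards them to the callback, then releases them with va_end(). */
int toy_call_with_arguments(int *element, int num_args, ...)
{
    va_list args;
    int result;

    va_start(args, num_args);
    result = toy_sum_callback(element, num_args, args);
    va_end(args);
    return result;
}
```

The callback must know the type of each argument in advance, because va_arg() cannot discover it at runtime.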
Iteration by Move Forward
It's also trivially possible to iterate through a HashTable without using a callback. For this, you'll need to be reminded of an often ignored concept in HashTables: the internal pointer.
In userspace, the functions reset(), key(), current(), next(), prev(), each(), and end() can be used to access elements within an array depending on where an invisible bookmark believes the "current" position to be:
<?php
$arr = array('a'=>1, 'b'=>2, 'c'=>3);
reset($arr);
while (list($key, $val) = each($arr)) {
    /* Do something with $key and $val */
}
reset($arr);
$firstkey = key($arr);
$firstval = current($arr);
$bval = next($arr);
$cval = next($arr);
?>
Each of these functions is duplicated by (or, more to the point, wrapped around) internal Zend Hash API functions with similar names:
/* reset() */
void zend_hash_internal_pointer_reset(HashTable *ht);
/* key() */
int zend_hash_get_current_key(HashTable *ht, char **strIdx, uint *strIdxLen, ulong *numIdx, zend_bool duplicate);
/* current() */
int zend_hash_get_current_data(HashTable *ht, void **pData);
/* next()/each() */
int zend_hash_move_forward(HashTable *ht);
/* prev() */
int zend_hash_move_backwards(HashTable *ht);
/* end() */
void zend_hash_internal_pointer_end(HashTable *ht);
/* Other... */
int zend_hash_get_current_key_type(HashTable *ht);
int zend_hash_has_more_elements(HashTable *ht);
Note
The next(), prev(), and end() userspace statements actually map to their move forward/backward statements followed by a call to zend_hash_get_current_data(). each() performs the same steps as next(), but calls and returns zend_hash_get_current_key() as well.
Emulating a foreach() loop using iteration by moving forward actually starts to look more familiar, repeating the print_zval_and_key example from earlier:
void php_sample_print_var_hash(HashTable *arrht)
{
    for (zend_hash_internal_pointer_reset(arrht);
         zend_hash_has_more_elements(arrht) == SUCCESS;
         zend_hash_move_forward(arrht)) {
        char *key;
        uint keylen;
        ulong idx;
        int type;
        zval **ppzval, tmpcopy;

        type = zend_hash_get_current_key_ex(arrht, &key, &keylen,
                                            &idx, 0, NULL);
        if (zend_hash_get_current_data(arrht, (void**)&ppzval) == FAILURE) {
            /* Should never actually fail
             * since the key is known to exist. */
            continue;
        }
        /* Duplicate the zval so that
         * the original's contents are not destroyed */
        tmpcopy = **ppzval;
        zval_copy_ctor(&tmpcopy);
        /* Reset refcount & Convert */
        INIT_PZVAL(&tmpcopy);
        convert_to_string(&tmpcopy);
        /* Output */
        php_printf("The value of ");
        if (type == HASH_KEY_IS_STRING) {
            /* String Key / Associative */
            PHPWRITE(key, keylen);
        } else {
            /* Numeric Key */
            php_printf("%ld", idx);
        }
        php_printf(" is: ");
        PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
        php_printf("\n");
        /* Toss out old copy */
        zval_dtor(&tmpcopy);
    }
}
Most of this code fragment should be immediately familiar to you. The one item that hasn't yet been touched on is zend_hash_get_current_key()'s return value. When called, this function will return one of three constants as listed in Table 8.2.
Constant | Meaning |
---|---|
HASH_KEY_IS_STRING | The current element is associatively indexed; therefore, a pointer to the element's key name will be populated into strIdx, and its length will be populated into strIdxLen. If the duplicate flag is set to a nonzero value, the key will be estrndup()'d before being populated into strIdx. The calling application is expected to free this duplicated string. |
HASH_KEY_IS_LONG | The current element is numerically indexed and numIdx will be supplied with the index number. |
HASH_KEY_NON_EXISTANT | The internal pointer is past the end of the HashTable's contents. Neither a key nor a data value is available at this position because no more exist. |
Preserving the Internal Pointer
When iterating through a HashTable, particularly one containing userspace variables, it's not uncommon to encounter circular references, or at least self-overlapping loops. If one iteration context starts looping through a HashTable and the internal pointer reaches, for example, the halfway mark, a subordinate iterator that starts looping through the same HashTable would obliterate the current internal pointer position, leaving the HashTable at the end when control arrived back at the first loop.
The way this is resolved, both within the zend_hash_apply implementation and within custom move forward uses, is to supply an external pointer in the form of a HashPosition variable.
Each of the zend_hash_*() functions listed previously has a zend_hash_*_ex() counterpart that accepts one additional parameter in the form of a pointer to a HashPosition data type. Because the HashPosition variable is seldom used outside of a short-lived iteration loop, it's sufficient to declare it as an immediate variable. You can then pass its address on each call, such as in the following variation on the php_sample_print_var_hash() function you saw earlier:
void php_sample_print_var_hash(HashTable *arrht)
{
    HashPosition pos;

    for (zend_hash_internal_pointer_reset_ex(arrht, &pos);
         zend_hash_has_more_elements_ex(arrht, &pos) == SUCCESS;
         zend_hash_move_forward_ex(arrht, &pos)) {
        char *key;
        uint keylen;
        ulong idx;
        int type;
        zval **ppzval, tmpcopy;

        type = zend_hash_get_current_key_ex(arrht, &key, &keylen,
                                            &idx, 0, &pos);
        if (zend_hash_get_current_data_ex(arrht,
                (void**)&ppzval, &pos) == FAILURE) {
            /* Should never actually fail
             * since the key is known to exist. */
            continue;
        }
        /* Duplicate the zval so that
         * the original's contents are not destroyed */
        tmpcopy = **ppzval;
        zval_copy_ctor(&tmpcopy);
        /* Reset refcount & Convert */
        INIT_PZVAL(&tmpcopy);
        convert_to_string(&tmpcopy);
        /* Output */
        php_printf("The value of ");
        if (type == HASH_KEY_IS_STRING) {
            /* String Key / Associative */
            PHPWRITE(key, keylen);
        } else {
            /* Numeric Key */
            php_printf("%ld", idx);
        }
        php_printf(" is: ");
        PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
        php_printf("\n");
        /* Toss out old copy */
        zval_dtor(&tmpcopy);
    }
}
With these very slight additions, the HashTable's true internal pointer is preserved in whatever state it was initially in on entering the function. When it comes to working with internal pointers of userspace variable HashTables (that is, arrays), this extra step will very likely make the difference between whether the scripter's code works as expected.
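The difference between the shared internal pointer and a private position can be modeled on a plain array. This is a toy sketch with hypothetical names (toy_list and the toy_*_ex() helpers), not engine code; like the calls shown earlier that pass NULL, it assumes a NULL position means "use the internal pointer".

```c
#include <stddef.h>

/* Hypothetical list with a shared "internal pointer" */
typedef struct {
    const int *items;
    size_t count;
    size_t internal;
} toy_list;

/* Use the caller's position when one is given, else the internal pointer */
static size_t *toy_cursor(toy_list *list, size_t *pos)
{
    return pos ? pos : &list->internal;
}

void toy_reset_ex(toy_list *list, size_t *pos)
{
    *toy_cursor(list, pos) = 0;
}

int toy_has_more_ex(toy_list *list, size_t *pos)
{
    return *toy_cursor(list, pos) < list->count;
}

void toy_forward_ex(toy_list *list, size_t *pos)
{
    (*toy_cursor(list, pos))++;
}

/* Sums every element using a private position, so the list's internal
 * pointer is exactly where the caller left it when this returns. */
int toy_sum(toy_list *list)
{
    size_t pos;
    int total = 0;

    for (toy_reset_ex(list, &pos);
         toy_has_more_ex(list, &pos);
         toy_forward_ex(list, &pos)) {
        total += list->items[pos];
    }
    return total;
}
```

A caller that has already advanced the internal pointer can call toy_sum() in the middle of its own loop without losing its place.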
Destruction
There are only four destruction functions you need to worry about. The first two are used for removing individual elements from a HashTable:
int zend_hash_del(HashTable *ht, char *arKey, uint nKeyLen);
int zend_hash_index_del(HashTable *ht, ulong h);
As you can guess, these cover a HashTable's split-personality index design by providing deletion functions for both associative and numerically indexed hash elements. Each version returns either SUCCESS or FAILURE.
Recall that when an item is removed from a HashTable, the HashTable's destructor function is called with a pointer to the item to be removed passed as the only parameter.
void zend_hash_clean(HashTable *ht);
When completely emptying out a HashTable, the quickest method is to call zend_hash_clean(), which will iterate through every element calling zend_hash_del() on them one at a time.
void zend_hash_destroy(HashTable *ht);
Usually, when cleaning out a HashTable, you'll want to discard it entirely. Calling zend_hash_destroy() will perform all the actions of a zend_hash_clean(), as well as free additional structures allocated during zend_hash_init().
A full HashTable life cycle might look like the following:
int sample_strvec_handler(int argc, char **argv TSRMLS_DC)
{
    HashTable *ht;

    /* Allocate a block of memory
     * for the HashTable structure */
    ALLOC_HASHTABLE(ht);
    /* Initialize its internal state */
    if (zend_hash_init(ht, argc, NULL,
                       ZVAL_PTR_DTOR, 0) == FAILURE) {
        FREE_HASHTABLE(ht);
        return FAILURE;
    }
    /* Populate each string into a zval* */
    while (argc-- > 0) {
        zval *value;

        MAKE_STD_ZVAL(value);
        /* Copy the current argument, then step to the next one */
        ZVAL_STRING(value, *argv, 1);
        argv++;
        if (zend_hash_next_index_insert(ht, (void**)&value,
                                        sizeof(zval*), NULL) == FAILURE) {
            /* Silently skip failed additions */
            zval_ptr_dtor(&value);
        }
    }
    /* Do some work */
    process_hashtable(ht);
    /* Destroy the hashtable
     * freeing all zval allocations as necessary */
    zend_hash_destroy(ht);
    /* Free the HashTable itself */
    FREE_HASHTABLE(ht);
    return SUCCESS;
}
Sorting, Comparing, and Going to the Extreme(s)
A couple more callbacks exist in the Zend Hash API. The first handles comparing two elements either from the same HashTable, or from similar positions in different HashTables:
typedef int (*compare_func_t)(void *a, void *b TSRMLS_DC);
Like the usort() callback in userspace PHP, this function expects you to compare the values of a and b. Using your own criteria for comparison, return either -1 if a is less than b, 1 if b is less than a, or 0 if they are equal.
int zend_hash_minmax(HashTable *ht, compare_func_t compar, int flag, void **pData TSRMLS_DC);
The simplest API function to use this callback is zend_hash_minmax(), which, as the name implies, will return the highest or lowest valued element from a HashTable based on the ultimate result of multiple calls to the comparison callback. Passing zero for flag will return the minimum value; passing non-zero will return the maximum.
The following example compares the registered userspace functions by name and reports the lowest and highest named functions (not case-sensitive):
int fname_compare(zend_function *a, zend_function *b TSRMLS_DC)
{
    return strcasecmp(a->common.function_name,
                      b->common.function_name);
}

void php_sample_funcname_sort(TSRMLS_D)
{
    zend_function *fe;

    if (zend_hash_minmax(EG(function_table), fname_compare,
            0, (void **)&fe) == SUCCESS) {
        php_printf("Min function: %s\n", fe->common.function_name);
    }
    if (zend_hash_minmax(EG(function_table), fname_compare,
            1, (void **)&fe) == SUCCESS) {
        php_printf("Max function: %s\n", fe->common.function_name);
    }
}
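The comparison contract (-1, 0, or 1) and the min/max scan built on top of it can be exercised outside the engine. The sketch below is standalone; toy_compare_ints() and toy_minmax() are hypothetical names, and the scan is a plain linear pass rather than a claim about how zend_hash_minmax() is implemented.

```c
#include <stddef.h>

/* Hypothetical comparison callback: -1 if a < b, 1 if a > b, 0 if equal */
int toy_compare_ints(const void *a, const void *b)
{
    int left = *(const int *)a;
    int right = *(const int *)b;

    if (left < right) return -1;
    if (left > right) return 1;
    return 0;
}

/* Hypothetical min/max scan: flag == 0 selects the minimum, non-zero the
 * maximum. Returns a pointer to the winning element, or NULL if empty. */
const int *toy_minmax(const int *items, size_t count,
                      int (*compar)(const void *, const void *), int flag)
{
    const int *best;
    size_t i;

    if (count == 0) {
        return NULL;
    }
    best = &items[0];
    for (i = 1; i < count; i++) {
        int result = compar(&items[i], best);

        if (flag ? (result > 0) : (result < 0)) {
            best = &items[i];
        }
    }
    return best;
}
```

Because the scan only ever looks at the sign of the callback's result, any comparator that honors the contract can be swapped in without touching toy_minmax().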
The hash comparison function is also used in zend_hash_compare(), which evaluates two hashes against each other as a whole. If hta is found to be "greater" than htb, 1 will be returned. -1 is returned if htb is "greater" than hta, and 0 if they are deemed equal.
int zend_hash_compare(HashTable *hta, HashTable *htb, compare_func_t compar, zend_bool ordered TSRMLS_DC);
This method begins by comparing the number of elements in each HashTable. If one HashTable contains more elements than the other, it is immediately deemed greater and the function returns quickly.
Next it starts a loop with the first element of hta. If the ordered flag is set, it compares keys/indices with the first element of htb; string keys are compared first on length, and then on binary sequence using memcmp(). If the keys are equal, the value of the element is compared with the first element of htb using the comparison callback function.
If the ordered flag is not set, the data portion of the first element of hta is compared against the element with a matching key/index in htb using the comparison callback function. If no matching element can be found in htb, then hta is considered greater than htb and 1 is returned.
If at the end of a given loop, hta and htb are still considered equal, comparison continues with the next element of hta until a difference is found or all elements have been exhausted, in which case 0 is returned.
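The ordered comparison walk described above can be sketched on simple arrays. This is a simplified model, not the engine's code: toy_entry and toy_compare_ordered() are hypothetical, only numeric keys are modeled, and values are compared directly instead of through a callback.

```c
#include <stddef.h>

/* Hypothetical element: a numeric index and an integer value */
typedef struct { long key; int value; } toy_entry;

/* Returns 1 if a is "greater", -1 if b is "greater", 0 if equal.
 * Step 1: the table with more elements wins immediately.
 * Step 2: walk both in order, comparing key first and then value. */
int toy_compare_ordered(const toy_entry *a, size_t na,
                        const toy_entry *b, size_t nb)
{
    size_t i;

    if (na != nb) {
        return (na > nb) ? 1 : -1;
    }
    for (i = 0; i < na; i++) {
        if (a[i].key != b[i].key) {
            return (a[i].key > b[i].key) ? 1 : -1;
        }
        if (a[i].value != b[i].value) {
            return (a[i].value > b[i].value) ? 1 : -1;
        }
    }
    return 0;   /* every position matched */
}
```

Note that the element count is decisive: a longer table is reported as greater even when every one of its values is smaller.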
The second callback function in this family is the sort function:
typedef void (*sort_func_t)(void **Buckets, size_t numBuckets, size_t sizBucket, compare_func_t comp TSRMLS_DC);
This callback will be triggered once, and receive a vector of all the Buckets (elements) in the HashTable as a series of pointers. These Buckets may be swapped around within the vector according to the sort function's own logic with or without the use of the comparison callback. In practice, sizBucket will always be sizeof(Bucket*).
Unless you plan on implementing your own alternative bubblesort method, you won't need to implement a sort function yourself. A predefined sort method, zend_qsort, already exists for use as a callback to zend_hash_sort(), leaving you to implement the comparison function only.
int zend_hash_sort(HashTable *ht, sort_func_t sort_func, compare_func_t compare_func, int renumber TSRMLS_DC);
The final parameter to zend_hash_sort(), when set, will toss out any existing associative keys or index numbers and reindex the array based on the result of the sorting operation. The userspace sort() implementation uses zend_hash_sort() in the following manner:
zend_hash_sort(target_hash, zend_qsort, array_data_compare, 1 TSRMLS_CC);
where array_data_compare is a simple compare_func_t implementation that sorts according to the value of the zval*s in the HashTable.
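The effect of the renumber flag can be illustrated with the standard library's qsort() on an array of key/value pairs. This is a sketch only: toy_pair, toy_pair_compare(), and toy_sort() are hypothetical names, and qsort() stands in for zend_qsort purely for illustration.

```c
#include <stdlib.h>

/* Hypothetical element: an index number and the value stored under it */
typedef struct { long key; int value; } toy_pair;

/* Comparison callback ordering pairs by value: returns -1, 0, or 1 */
static int toy_pair_compare(const void *a, const void *b)
{
    const toy_pair *left = (const toy_pair *)a;
    const toy_pair *right = (const toy_pair *)b;

    if (left->value < right->value) return -1;
    if (left->value > right->value) return 1;
    return 0;
}

/* Sorts pairs by value. When renumber is non-zero, the existing keys are
 * tossed out and replaced with 0, 1, 2, ... in the new order. */
void toy_sort(toy_pair *pairs, size_t count, int renumber)
{
    size_t i;

    qsort(pairs, count, sizeof(toy_pair), toy_pair_compare);
    if (renumber) {
        for (i = 0; i < count; i++) {
            pairs[i].key = (long)i;
        }
    }
}
```

With renumber set to zero the original keys travel with their values, which corresponds to a sort that preserves index association; with it set, the result is reindexed from zero as described for sort().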