Name

KeyedCollection, KeyedIndex -- supertype of Set and Map types


Synopsis

//
// KeyedCollection -- supertype of Set and Map types 
//
@deftype KeyedCollection <Collection>
CREATING
- (void)	setDuplicatesOption: duplicatesOption;
- (void)	setGroupType: aCollectionType;
- (void)	setSingletonGroups: (BOOL)singletonGroups;

- (void)	setSorted: (BOOL)sorted;
- (void)	setCompareFunction: (idXid_t)aFunction;
- (void)        setBucketFunction: (bucket_t)aFunction setBucketCount: (int)n;

- (void)	setPartiallyOrdered: (BOOL)partiallyOrdered;
- (void)	setMemberSlot: (int)byteOffset;
SETTING
- (void)	setPartialOrderContext: aKeyedCollectionCollection;
USING
-		getDupOption;
-		getGroupType;
- (BOOL)	getSingletonGroups;

- (BOOL)	getSorted;
- (int(*)())	getCompareFunction;
- (int(*)())	getBucketFunction;
- (int)		getMaxBuckets;

- (BOOL)	getPartiallyOrdered;
-		getPartialOrderContext;

- (int)		getMemberSlot;

-		at: aKey;
- (BOOL)	containsKey: aKey;
- (int)		getCountAtKey: aKey;

-		removeKey: aKey;

-		createIndex: aZone at: aKey;
-		createIndex: aZone setMember: aMember;

- (void)	getPredecessors: aMember;
- (void)	getSuccessors: aMember;
- (void)	getCostarts: aMember;
- (void)	getCoends: aMember;
@end

//
// KeyedIndex -- index behavior shared by Set and Map types
//
@deftype KeyedIndex 
- (int)		getCurrentLevel;
-		getIndexAtLevel: (int)level;

-		at: aKey;

-		setMember: aMember;
@end

Description

A keyed collection is a collection in which each member can be compared with some other value that identifies the member. This value is referred to as the member key. The key value may be determined either by the member value itself, which defines a Set, or by external association with the member when the member is first added, which defines a Map.

The keyed collection type establishes the common behavior shared by both Set and Map. Standard options are provided to declare ordering of members in the collection, which can be based either on an inherent ordering of the key values (a sorted collection) or external maintenance of a total or partial order. Additionally, options are provided to declare whether duplicate key values should be permitted, and if so, how.

The options for handling of duplicate keys provide considerable flexibility. Entire structures of nested collections can be created that subdivide the key space through progressive levels. By accepting duplicate keys, and organizing members of the same identity into subgroups, keyed collections support the abstract data type known as a bag or multiset. The index on a keyed collection includes the ability to traverse throughout the contents of any of these nested collections.


Inherited behavior

@deftype KeyedCollection <Collection>
The KeyedCollection type inherits all standard behavior of Collection. The KeyedCollection type is not itself creatable; it only serves as a common supertype for Set and Map collection types.

The default value of IndexSafety for a KeyedCollection is Unsafe, but values of either UnsafeAtMember or SafeAlways can be requested to guarantee a higher level of safety.

Create-time options

(.. None of these options is implemented yet, except for CompareFunction.)

CREATING
- (void)	setDupOption: dupOption;
- (void)	setGroupType: aCollectionType;
- (void)	setSingletonGroups: singletonGroups;

- (void)	setSorted: (BOOL)sorted;
- (void)	setCompareFunction: (idXid_t)aFunction;

- (void)	setPartiallyOrdered: (BOOL)partiallyOrdered;

- (void)	setMemberSlot: (int)byteOffset;
SETTING
- (void)        setBucketFunction: (bucket_t)aFunction setBucketCount: (int)n;
- (void)	setPartialOrderContext: aKeyedCollection;
USING
-		getDupOption;
-		getGroupType;
- (BOOL)	getSingletonGroups;

- (BOOL)	getSorted;
- (int(*)())	getCompareFunction;

- (BOOL)	getPartiallyOrdered;
-		getPartialOrderContext;

- (int)		getMemberSlot;

- (int(*)())	getBucketFunction;
- (int)		getBucketCount;

DupOption

The value of DupOption specifies how duplicate key values are to be handled within a keyed collection. This option has one of the following values:
id <Symbol>  DupIsError, DupRejected, KeepDuplicates, KeepCountOnly;
If the DupIsError value is specified, any attempt to insert a duplicate key is an error, and effects of such an attempt depend on the level of debug checking. If debug checking for this error is turned on, the collection will check whether any new key duplicates an existing one, and raise a DuplicateKey error if it does. If debug checking is turned off, there is no guarantee that any check will be made, and the effects could be unpredictable in or catastrophic or insidious ways.

With a value of DupRejected, attempts to insert a duplicate key are always detected, and the attempt rejected. (All insert messages return a boolean flag as to whether a duplicate occurred.)

With a value of KeepDuplicates, attempts to insert a duplicate key are always accepted, and all members having the same key are retained in a specially created internal collection containing just the members that all have that same key. This collection then becomes the single member internally associated with the key value.

With a value of KeepCountOnly, duplicate members are not retained. Instead, only a count of the net number of insertions and removals at that key value are retained within the collection. With this value, member values cannot be retrieved from the collection, but only a count of members for a given key value, using different messages. This value is accepted only for the Set subtype of a KeyedCollection.

GroupType

The GroupType option specifies the type of internal collection to be created for members that all have the same key. This option is valid only if DupOption is also set to a value of KeepDuplicates.

Without this option, the type of internal collection created for members with the same key depends on the specific subtype. This option allows an external program to customize the type of internal collection permitted. The argument must be a previously created subtype of Collection. Typically, the argument is a customized version of one of the standard collection types such as List, Set, or Map; the argument would be the type object returned by customizeEnd after a customizeBegin: sequence.

More information about all the ways this option can be interpreted is provided with each subtype.

SingletonGroups

If this option is true, the type of collection created for members with duplicate keys is also created even when there is only member for a particular key. This option can be convenient to assure uniformity of a keyed collection with nested structure. This option is valid only if DupOption is also set to a value of KeepDuplicates.

Sorted

If this option is true, the immediate members of a keyed collection (included collections of duplicate members, if any) are totally ordered according to an ordering relation defined on key values. The default order for sorting is determined by a standard compare: message on a key value to be added vs. existing keys of the collection. This default method to establish order can be overridden using the CompareFunction option.

The default for the Sorted function is true, unless a value is also given for the option BucketFunction.

CompareFunction

Use the function pointed to by the argument as the method for determining whether two members have the same key value, and also for ordering the keys of the collection if the Sorted option is true. The argument is declared with the following function pointer type:
typedef int (*idXid_t)( id, id );
The function given will be called whenever one key value needs to be compared with another. Multiple calls to the function typically occur whenever members are added or removed from the collection, until the correct member for insertion or removal is determined.

The compare function accepts two key values as its arguments. Like member types, key types declared with the id pointer type, but may contain any type of value (such as an integer) up to the size of an id. An explicit compare function can support any type of value as a key, regardless of whether it is an object that supports a standard compare: message.

The compare function is called repeatedly by the collection to compare two key values. The function should return zero if the key values are equal, -1 for the first key argument less than the second, and +1 for the first greater than the second. If a keyed collection is not sorted, either -1 or +1 may be returned for unequal keys, regardless of whether one might be taken as greater or less than the other.

BucketFunction

The BucketFunction option specifies a function that is called whenever a member of an unsorted collection is added or removed. It may also be used for a sorted collection to provide faster insertion and removal of members. The responsibility of a "bucket function" is to assign the key value to some integer in a range from zero up to some maximum value. These integers identify "buckets" to which the same key value will also be assigned whenever passed to the function. If the collection is unsorted, these buckets represent hash buckets, and for speed should be relatively sparse (few keys likely in any one bucket, given the expected size of the collection). If the collection is sorted, these buckets represent ordered buckets to which the same range of key values will always be assigned. Such sorted buckets can speed up the search for a key location within a sorted collection. Sorted buckets can also be sparse to minimize the need for further sorting within each bucket. If many keys end up in the same sorted bucket, the sorting process normally used without buckets is still used, so there is less performance impact than with hash keys of an unsorted collection.

A bucket function accepts a single key value as its argument, and return an integer that identifies a bucket. As with a compare function, the key value is declared as an id pointer type, but may actually contain whatever kind of value is used within the collection. Following is the type of the bucket function pointer:

typedef int (*bucket_t)( id );
A new bucket function may be established any time during the lifetime of a collection. A new bucket function can ensure that keys remain suitably distributed to buckets as the total number of keys in a collection grows or shrinks.

BucketCount

A value for MaxBuckets must be given in the same message that sets a bucket function. Its value specifies the number of buckets to which a bucket function assigns its keys. The maximum value returned by a bucket function should be one less than this value.

PartiallyOrdered

A partially ordered collection supports the maintenance of individual ordering relations among its members. Each such relation is an assertion that one member should always precede another in the enumeration sequence of the collection. When an index traverses the collection, it will guarantee that all the predecessors of a member have already been visited before the member itself is reached.

If true, PartiallyOrdered specifies that messages for maintaining a partial order are enabled on the collection. An error will be raised if individual members do not have a unique identity within the innermost collection within which they are contained. This means that if duplicate keys are permitted by the collection, a collection type must also be specified for GroupType which does not itself accept duplicates. Every member in a partial order must have a unique identity so that the member itself is sufficient to identify it within the network of partial orders.

PartialOrderContext: aKeyedCollectionCollection

Ordering relations may be maintained not only between members belonging to the same collection, but between members and other collections, as long as all participants in an ordering are contained somewhere within a larger common collection. An ordering relation with a nested collection specifies that all members in the collection are successors or predecessors of the other member. Ordering relations can also be established between members of a larger, containing collection, and individual members of a nested collection.

The keyed collection specified as the value for PartialOrderContext must specify another collection in which the local collection is maintained. It is an error if the PartialOrderContext collection does not actually contain the local collection.

getMemberSlot

The MemberSlot option indicates that space has been reserved within each member that allows it to contain a link directly to its position in the enumeration for a collection. If such space has been reserved, special messages can be used that rapidly position an index directly to the member. Operations to remove members are also much faster.

The value of MemberSlot specifies the offset in bytes from the start of each member where the space for its position link has been reserved. The following typedefs are provided to declare the space required for a position link within a member allocation:

typedef struct memberData { void *memberData[2]; }            member_t;
typedef struct memberData { void *memberData[2]; id owner; }  dupmember_t;

The first type declares the required space for a position link in a collection that does not accept duplicate key values. The second type is required if duplicate keys are permitted.

The offset of the position link from the start of a member may be either positive or negative, in the range of -2048 to +2047. The default value of MemberSlot is UnknownOffset (a large negative value), which specifies that no slot for an internal position link is available within each member.

If a member slot is defined, a memory pointer of some kind must always be used for the value of every member. These members may be object id pointers, but other types of memory pointers are acceptable as well.

The contents of the first two words of position link vary according to the position of a member in a collection. The third word of a dupmember_t link contains the id of the collection in which a member is directly contained. This collection is either an internally created collection containing members with duplicate keys, or the collection to which the member was added if there is no other member with the same key.


KeyedCollection behavior

(.. None of the messages or other behavior relating to duplicate key values are currently implemented.)
-		at: aKey;
- (BOOL)	containsKey: aKey;
- (int)		getCountAtKey: aKey;  // unimplemented

-		removeKey: aKey;

The at: message returns the existing member of the collection which matches the key value passed as its argument, or nil if there is no key value in the collection which matches. If duplicate entries for this key exist, the entire collection of duplicate members created for the key value is returned instead.

The containsKey: message returns true if the key value passed as its argument is contained in the collection, and false otherwise. The getCountAtKey message returns the number of member entries which are all associated with its key value argument. This number is zero if the key value is not contained in the collection, and will never exceed one one unless a collection accepts duplicate entries for a key.

The removeKey: message removes a member matching a key value from the collection, and returns the member just removed. It returns nil if there is no key value in the collection which matches. If more than one entry was present for the key value, it removes and returns the first member in the internal collection created for duplicate members.

-		createIndex: aZone at: aKey;
-		createIndex: aZone setMember: aMember;
(.. Neither of these messages is currently implemented.)

The createIndex:at: message creates a new index on the collection and also sets its current position to the first member matching the key value given as its final argument. It returns the index just created.

The createIndex:at: message creates a new index on the collection and also sets its current position to the member passed as its final argument. This message is valid only if an internal member slot was defined for the collection with the MemberSlot option.

- (void)	getPredecessors: aMember;
- (void)	getSuccessors: aMember;
- (void)	getCostarts: aMember;
- (void)	getCoends: aMember;
(.. None of these messages is currently implemented.) These messages maintain the ordering relations within a partially ordered collection.

KeyedIndex behavior

- (int)		getCurrentLevel;
-		getIndexAtLevel: (int)level;
An index to a keyed collection traverses all members of the collection, regardless of whether these members belong to collections of members entered under duplicate key values. Internally, however, an index keeps track of any specific subcollection it is currently processing. The getCurrentLevel message returns the level of nesting, within collections of duplicate entries for a current key value, to which the current position of an index belongs. The level returned is an integer that starts at zero for the top-level collection, and increases by one for each level of nested collection that is currently active.

While processing any nested collections, the index for a keyed collection actually maintains a separate, internal index for each level of collection that is currently active. The index at each level may be retrieved using the getIndexAtLevel: message.

-		at: aKey;
The at: message repositions the index to the first member that matches the key value passed as its argument.

-		setMember: aMember;
(.. This message is not currently implemented.) The setMember: message repositions the index to the member passed as its argument. This message is valid only if an internal member slot was defined for the collection with the MemberSlot option.


Roger Burkhart <rmb@santafe.edu>
Last modified: