Java - Simple HashTable implementation

Since Java supports HashMap, there is no need to implement HashTable. However, if you want to know how the algorithm works, or if you do not want to use the library in an algorithm test, you must implement it yourself.

There are many different ways to create a HashTable, and each method has different performance. In this article, I have implemented a very simple HashTable in Java.

HashTable

HashTable refers to storing the data of (key, value) in a table. It is guaranteed to be very fast enough to store or retrieve data in O(1). What makes this speed possible is changing the key to a hash.

Since different keys can have the same hash, there is a difference in memory consumption and search time depending on how you handle this.

The method implemented in this article tries to see if data with the same hash can be stored in the next table with hash + 1 if there is an already stored value. It would be nice to apply the Chain method like Linked List here, but if you think that there will be few duplicate hashes, the performance difference will not be large even if implemented in the same way as hash + 1.

Example

The code below is an example of a simple implementation of HashTable.

It provides the following methods:

  • get(key)
  • put(key, value)
  • remove(key)
  • size()
  • toString()
public static class Item {
    public String key;
    public Object value;

    public Item(String key, Object value) {
        this.key = key;
        this.value = value;
    }
}

public static class HashTable {
    private static int HASH_TABLE_CAPACITY = 1000;
    private Item[] data = new Item[HASH_TABLE_CAPACITY];
    private int size = 0;

    private int getHash(String key) {
        int hash = 0;
        for (int i = 0; i < key.length(); i++) {
            char val = key.charAt(i);
            hash = (hash + val*(i + 1)) % HASH_TABLE_CAPACITY;
        }
        return hash;
    }

    public Object get(String key) {
        if (key != null) {
            int hash = getHash(key);
            while (data[hash] != null && !data[hash].key.equals(key)) {
                hash = (hash + 1) % HASH_TABLE_CAPACITY;
            }
            return data[hash] != null ? data[hash].value : null;
        }
        return null;
    }

    public void put(String key, Object value) {
        if (key != null) {
            int hash = getHash(key);
            while (data[hash] != null && !data[hash].key.equals(key)) {
                hash = (hash + 1) % HASH_TABLE_CAPACITY;
            }
            data[hash] = new Item(key, value);
            size++;
        }
    }

    public Object remove(String key) {
        Object removed = null;
        if (key != null) {
            int hash = getHash(key);
            while (data[hash] != null && !data[hash].key.equals(key)) {
                hash = (hash + 1) % HASH_TABLE_CAPACITY;
            }
            if (data[hash] != null) {
                removed = data[hash];
                data[hash] = null;
                size--;
            }
        }
        return removed;
    }

    public int size() {
        return size;
    }

    public String toString() {
        String out = "<HashTable>\n";
        for (int i = 0; i < data.length; i++) {
            Item item = data[i];
            if (item != null) {
                out += "  Key(hash, index): " + data[i].key
                        + "(" + getHash(data[i].key) + ", " + i + ")"
                        + ", Value: " + data[i].value + "\n";
            }
        }
        return out;
    }
}
  • The table array size is set to 1000. It can also be implemented to dynamically resize the array by passing the size as an argument.
  • In order to implement very simply, we have saved the Object without using Generics.
  • Implemented so that null is not allowed as a key
  • When calculating the hash of String, it is implemented directly without using String.hashCode(). I multiplied the ASCII by the string index and divided that sum by the size of the table. Applying a more complex algorithm seems less likely to cause hash collisions

Execution

I checked whether the HashTable implemented as follows works well.

HashTable hashTable = new HashTable();

hashTable.put("Tokyo", "Japan");
hashTable.put("Seoul", "Korea");
hashTable.put("Beijing", "China");
hashTable.put("Paris", "France");
hashTable.put("Washington", "USA");
hashTable.put("Brazilia", "Brazil");
System.out.println("Size: " + hashTable.size());

System.out.println((String) hashTable.get("Tokyo"));
System.out.println((String) hashTable.get("Seoul"));
System.out.println((String) hashTable.get("Beijing"));
System.out.println((String) hashTable.get("Paris"));
System.out.println((String) hashTable.get("Washington"));
System.out.println((String) hashTable.get("Brazilia"));

hashTable.remove("Brazilia");
System.out.println("Size: " + hashTable.size());
System.out.println((String) hashTable.get("Brazilia"));
System.out.println(hashTable.toString());

Running the above code will output the following:

Size: 6
Japan
Korea
China
France
USA
Brazil
Size: 5
null
<HashTable>
  Key(hash, index): Paris(611, 611), Value: France
  Key(hash, index): Seoul(626, 626), Value: Korea
  Key(hash, index): Tokyo(666, 666), Value: Japan
  Key(hash, index): Beijing(913, 913), Value: China
  Key(hash, index): Washington(975, 975), Value: USA

If you look at the result of toString() in the example above, there is no duplicate hash, so all are stored in the same index as the hash.

If you make it return the same hash as follows, it will be stored in the next index like hash + 1. Hash collisions are bad for performance, but if you think hash collisions are rare, you can opt for a simpler implementation.

private int getHash(String key) {
    return 0;
}

In the example above, if you make the hash return 0 and run the code, you can see that the hash is the same, but the stored index is different as follows

<HashTable>
  Key(hash, index): Tokyo(0, 0), Value: Japan
  Key(hash, index): Seoul(0, 1), Value: Korea
  Key(hash, index): Beijing(0, 2), Value: China
  Key(hash, index): Paris(0, 3), Value: France
  Key(hash, index): Washington(0, 4), Value: USA
codechachaCopyright ©2019 codechacha