Managing collisions

Managing Collisions

As a DSP handling big data and high request rate (as well as in any busy and complex server) we find it useful to keep in-memory mappings for data we use on request basis. Typically these mappings are stored in hashmap data structures, whose performance is highly influenced by the data distribution inside its buckets, or put differently, by the number of collisions.

A second use-case of interest which also involves collisions is when we try to map a set of N unique ids into a new domain of size B, in an attempt to reduce the space used for storing ids in internal data structures and data stores.

In both cases collisions should be taken into account. In the case of a hashmap we are dealing with a trade-off between memory consumption and running time. A larger table means faster lookups/updates, but slower iteration and greater memory consumption. In the mapping scenario, we may be saving space by reducing the id signature size, but on the same time we are losing data due to collisions. In both cases we would like to be able to control the number of collisions, keeping them below some predefined threshold.

private static void analyze(HashMap<?,?> m) throws ReflectiveOperationException {
      Map.Entry<?,?>[] entries = getTable(m);
      System.out.println("Bins: " + entries.length);
      System.out.println("Items: " + m.size());
      Field nextField = getNextField(entries);
    
      int collisions = 0;
      for (Map.Entry<?,?> e : entries) {
        if (e != null) {
          Object next = e;
          while ((next = nextField.get(next)) != null) {
           collisions++; 
          }
        }
      }
      System.out.println("Collisions: " + collisions);
    }
    
    private static Map.Entry<?, ?>[] getTable(Map<?, ?> map) throws ReflectiveOperationException {
      Field tableField = map.getClass().getDeclaredField("table");
      tableField.setAccessible(true);
      return (Map.Entry<?, ?>[]) tableField.get(map);
    }
    
    private static Field getNextField(Map.Entry<?,?>[] mapTable) throws ReflectiveOperationException {
      Class<?> nodeType = mapTable.getClass().getComponentType();
      Field nextNodeField = nodeType.getDeclaredField("next");
      nextNodeField.setAccessible(true);
      return nextNodeField;
    }
    

Let’s try it on a HashMap with many random strings as keys. Internally, HashMap uses powers of 2 for table sizes.

If we want to have a saturated map with 2^20=1,048,576 bins, we need to insert 0.75*1,048,576 = 786432 items:

public static void main(String[] args) throws ReflectiveOperationException {
    
      final int N = 786_432;
    
      HashMap<String, Void> m = new HashMap<>();
      for (int i = 0; i < N; i++) {
        m.put(UUID.randomUUID().toString(), null);
      }
    
      analyze(m);
    }
Logo Polaris

Want to look
under the hood?

Enter Ubimo. With its unique model, Ubimo has disrupted the traditional way of media-buying by offering tech companies and advertisers alike an all-inclusive platform designed with the buyer’s best interest in mind. Hence why I’m writing to you today, to tell you what I love about Ubimo’s platform – and why it’s here to stay.

Eyal Schneider
Eyal Schneider

Lead Engineer

Share on linkedin
Share on twitter
Share on facebook
The Holiday Guide to Digital Out-of-Home in NYC
Articles & Research

The Holiday Guide to Digital Out-of-Home in NYC

With Ubimo’s Polaris planning tool, we created a guide on how to reach a brand’s audience during the holidays in New York City with Digital Out-of-Home (DOOH)

Articles & Research

Premium Stores vs. Outlets: Who’s Winning Consumer Dollars?

Exploring Nordstrom, Saks, and Neiman Marcus Audiences and Retail Strategy

Get the latest in
Location Intelligence

Subscribe to the blog

black friday analysis ubimo
Articles & Research

Data That Will Make Black Friday Campaigns Hit Home

We used Ubimo’s Polaris to analyze location data from Black Friday weekend last year to see if any patterns emerged

ubimo acquired by quotient technology 2019
News

Ubimo’s Exciting News

We are joining the Quotient family! Ubimo will operate as an independent product group within Quotient’s media business

martech quotient ubimo
News

CPG marketing platform Quotient buys location data provider Ubimo

The deal continues Quotient’s evolution beyond its origins as Coupons.com