Managing collisions

Managing Collisions

As a DSP handling big data and high request rate (as well as in any busy and complex server) we find it useful to keep in-memory mappings for data we use on request basis. Typically these mappings are stored in hashmap data structures, whose performance is highly influenced by the data distribution inside its buckets, or put differently, by the number of collisions.

A second use-case of interest which also involves collisions is when we try to map a set of N unique ids into a new domain of size B, in an attempt to reduce the space used for storing ids in internal data structures and data stores.

In both cases collisions should be taken into account. In the case of a hashmap we are dealing with a trade-off between memory consumption and running time. A larger table means faster lookups/updates, but slower iteration and greater memory consumption. In the mapping scenario, we may be saving space by reducing the id signature size, but on the same time we are losing data due to collisions. In both cases we would like to be able to control the number of collisions, keeping them below some predefined threshold.

private static void analyze(HashMap<?,?> m) throws ReflectiveOperationException {
      Map.Entry<?,?>[] entries = getTable(m);
      System.out.println("Bins: " + entries.length);
      System.out.println("Items: " + m.size());
      Field nextField = getNextField(entries);
    
      int collisions = 0;
      for (Map.Entry<?,?> e : entries) {
        if (e != null) {
          Object next = e;
          while ((next = nextField.get(next)) != null) {
           collisions++; 
          }
        }
      }
      System.out.println("Collisions: " + collisions);
    }
    
    private static Map.Entry<?, ?>[] getTable(Map<?, ?> map) throws ReflectiveOperationException {
      Field tableField = map.getClass().getDeclaredField("table");
      tableField.setAccessible(true);
      return (Map.Entry<?, ?>[]) tableField.get(map);
    }
    
    private static Field getNextField(Map.Entry<?,?>[] mapTable) throws ReflectiveOperationException {
      Class<?> nodeType = mapTable.getClass().getComponentType();
      Field nextNodeField = nodeType.getDeclaredField("next");
      nextNodeField.setAccessible(true);
      return nextNodeField;
    }
    

Let’s try it on a HashMap with many random strings as keys. Internally, HashMap uses powers of 2 for table sizes.

If we want to have a saturated map with 2^20=1,048,576 bins, we need to insert 0.75*1,048,576 = 786432 items:

public static void main(String[] args) throws ReflectiveOperationException {
    
      final int N = 786_432;
    
      HashMap<String, Void> m = new HashMap<>();
      for (int i = 0; i < N; i++) {
        m.put(UUID.randomUUID().toString(), null);
      }
    
      analyze(m);
    }
Logo Polaris

Want to look
under the hood?

Enter Ubimo. With its unique model, Ubimo has disrupted the traditional way of media-buying by offering tech companies and advertisers alike an all-inclusive platform designed with the buyer’s best interest in mind. Hence why I’m writing to you today, to tell you what I love about Ubimo’s platform – and why it’s here to stay.

Eyal Schneider
Eyal Schneider

Lead Engineer

Share on linkedin
Share on twitter
Share on facebook
cpg case study
Case Studies

Case Study: CPG Client Achieves $11+ ROAS Using DOOH

Capturing Consumer Attention En Route to the Store Leads to Measurable Results

DOOH Growth Trajectory in 2021
News

DOOH Growth Trajectory in 2021

After a year spending a great deal of time at home and on screens, people want to get outside—making digital out-of-home (DOOH) a huge opportunity this year.

Get the latest in
Location Intelligence

Subscribe to the blog

Ubimo on Alcohol and Snack Sales Trends
News

Ubimo on Alcohol and Snack Sales Trends

Quotient’s data shows that both alcohol and snack sales have increased during COVID as people seek comfort and enjoyment while social distancing at home.

VIOOH and Ubimo Announce Partnership to Offer Data-Driven Digital Out-of-Home (DOOH) Advertising on JCDecaux Inventory
News

VIOOH and Ubimo Announce Partnership to Offer Data-Driven Digital Out-of-Home (DOOH) Advertising on JCDecaux Inventory

Advertisers can reach targeted audiences across JCDecaux’s 655 digital screens in the U.S.

VIOOH and Ubimo, a Quotient Brand, Announce Partnership
News

VIOOH and Ubimo, a Quotient Brand, Announce Partnership

Partnering with VIOOH allows advertisers to reach targeted audiences across JCDecaux’s DOOH screens in the U.S. through our programmatic media platform.