Java: Iteration through a HashMap, which is more efficient?


Given the following code, with two alternative ways to iterate through it,
is there any performance difference between these two methods?

        Map<String, Integer> map = new HashMap<String, Integer>();
        //populate map

        //alt. #1
        for (String key : map.keySet())
        {
            Integer value = map.get(key);
            //use key and value
        }

        //alt. #2
        for (Map.Entry<String, Integer> entry : map.entrySet())
        {
            String key = entry.getKey();
            Integer value = entry.getValue();
            //use key and value
        }

I am inclined to think that alt. #2 is the more efficient way to iterate through the entire map, but I could be wrong.

6 answers

#1


55  

Your second option is definitely more efficient: the first option performs an extra get() lookup for each of the n keys, whereas the second reads each value directly from the entry it is already visiting.

But nothing sticks better than trying it out when you can, so here goes.

(Not a perfect benchmark, but good enough to verify the assumption, at least on my machine.)

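// RandomStringUtils and RandomUtils come from Apache Commons Lang;
// java.util.HashMap and java.util.Map must be imported as well.
public static void main(String[] args) {

    Map<String, Integer> map = new HashMap<String, Integer>();
    // populate map

    int mapSize = 500000;
    int strLength = 5;
    for (int i = 0; i < mapSize; i++)
        map.put(RandomStringUtils.random(strLength), RandomUtils.nextInt());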

    long start = System.currentTimeMillis();
    // alt. #1
    for (String key : map.keySet()) {
        Integer value = map.get(key);
        // use key and value
    }
    System.out.println("Alt #1 took "+(System.currentTimeMillis()-start)+" ms");

    start = System.currentTimeMillis();
    // alt. #2
    for (Map.Entry<String, Integer> entry : map.entrySet()) {
        String key = entry.getKey();
        Integer value = entry.getValue();
        // use key and value
    }
    System.out.println("Alt #2 took "+(System.currentTimeMillis()-start)+" ms");
}

RESULTS (Some interesting ones)

With int mapSize = 5000; int strLength = 5;
Alt #1 took 26 ms
Alt #2 took 20 ms

With int mapSize = 50000; int strLength = 5;
Alt #1 took 32 ms
Alt #2 took 20 ms

With int mapSize = 50000; int strLength = 50;
Alt #1 took 22 ms
Alt #2 took 21 ms

With int mapSize = 50000; int strLength = 500;
Alt #1 took 28 ms
Alt #2 took 23 ms

With int mapSize = 500000; int strLength = 5;
Alt #1 took 92 ms
Alt #2 took 57 ms

...and so on
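
The timings above come from single cold runs, so JIT warm-up is mixed into them. A minimal sketch of a slightly more careful comparison, using only the JDK, might look like the following. The class name MapIterationTiming, the helper names, the "key" + i key format and the warm-up count are all illustrative assumptions, and a dedicated harness such as JMH would still be the more trustworthy tool.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; every name in this class is made up for the example.
final class MapIterationTiming {

    // Alt. #1: iterate over the keys and look each value up again.
    static long sumViaKeySet(Map<String, Integer> map) {
        long sum = 0;
        for (String key : map.keySet()) {
            sum += key.length() + map.get(key);
        }
        return sum;
    }

    // Alt. #2: iterate over the entries and read key and value directly.
    static long sumViaEntrySet(Map<String, Integer> map) {
        long sum = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            sum += entry.getKey().length() + entry.getValue();
        }
        return sum;
    }

    // Builds a map with `size` distinct keys ("key0", "key1", ...).
    static Map<String, Integer> buildMap(int size) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        for (int i = 0; i < size; i++) {
            map.put("key" + i, i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = buildMap(500000);

        // Warm-up rounds so the JIT has compiled both loops before timing.
        long sink = 0;
        for (int i = 0; i < 20; i++) {
            sink += sumViaKeySet(map) + sumViaEntrySet(map);
        }

        long t0 = System.nanoTime();
        sink += sumViaKeySet(map);
        long t1 = System.nanoTime();
        sink += sumViaEntrySet(map);
        long t2 = System.nanoTime();

        System.out.println("Alt #1 took " + (t1 - t0) / 1000 + " us");
        System.out.println("Alt #2 took " + (t2 - t1) / 1000 + " us");
        // Printing the accumulated sum keeps the loops from being optimized away.
        System.out.println("checksum " + sink);
    }
}
```

Both loops do the same arithmetic on key and value, so any remaining difference is mostly the cost of the extra get() call.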

#2


9  

The second snippet will be slightly faster, since it doesn't need to look each key up again.

All HashMap iterators call the same internal method (nextEntry in older JDK sources, nextNode in Java 8 and later), which returns the whole entry.

Your first snippet discards the value from the entry (in KeyIterator), then looks it up again in the map.

Your second snippet uses the key and value directly (from EntryIterator).

(Both keySet() and entrySet() are cheap calls)
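
One way to see the redundant lookups without timing anything is to count them. The sketch below assumes a hypothetical CountingMap subclass (not a JDK class) that increments a counter on every get() call; LookupCounter and the method names are likewise made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; LookupCounter and CountingMap are hypothetical names.
final class LookupCounter {

    // A HashMap that counts how often get() is called.
    static final class CountingMap<K, V> extends HashMap<K, V> {
        int getCalls = 0;

        @Override
        public V get(Object key) {
            getCalls++;
            return super.get(key);
        }
    }

    // Alt. #1: one get() call per key.
    static int lookupsViaKeySet(CountingMap<String, Integer> map) {
        map.getCalls = 0;
        for (String key : map.keySet()) {
            Integer value = map.get(key);
        }
        return map.getCalls;
    }

    // Alt. #2: key and value come from the entry, so get() is never called.
    static int lookupsViaEntrySet(CountingMap<String, Integer> map) {
        map.getCalls = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            String key = entry.getKey();
            Integer value = entry.getValue();
        }
        return map.getCalls;
    }

    public static void main(String[] args) {
        CountingMap<String, Integer> map = new CountingMap<String, Integer>();
        for (int i = 0; i < 1000; i++) {
            map.put("key" + i, i);
        }
        System.out.println("keySet + get: " + lookupsViaKeySet(map) + " lookups");   // 1000
        System.out.println("entrySet:     " + lookupsViaEntrySet(map) + " lookups"); // 0
    }
}
```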

#3


5  

The latter is more efficient than the former. A tool like FindBugs will actually flag the former (as WMI_WRONG_MAP_ITERATOR) and suggest that you use the latter.

#4


5  

Map:

Map<String, Integer> map = new HashMap<String, Integer>();

Besides those two options, there is one more.

1) keySet() - use it if you need only the keys

for ( String k : map.keySet() ) {
    ...
}

2) entrySet() - use it if you need both keys and values

for ( Map.Entry<String, Integer> entry : map.entrySet() ) {
    String k = entry.getKey();
    Integer v = entry.getValue();
    ...
}

3) values() - use it if you need only the values

for ( Integer v : map.values() ) {
    ...
}
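
Put together, the three views can be exercised in one small runnable sketch. The class name MapViews, the method names and the sample data are assumptions chosen for illustration; the results do not depend on HashMap's iteration order.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; MapViews and its methods are made-up names.
final class MapViews {

    // 1) keySet(): only the keys are needed.
    static int totalKeyLength(Map<String, Integer> map) {
        int total = 0;
        for (String k : map.keySet()) {
            total += k.length();
        }
        return total;
    }

    // 2) entrySet(): both key and value are needed.
    static int weightedTotal(Map<String, Integer> map) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            total += entry.getKey().length() * entry.getValue();
        }
        return total;
    }

    // 3) values(): only the values are needed.
    static int totalValue(Map<String, Integer> map) {
        int total = 0;
        for (Integer v : map.values()) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("a", 1);
        map.put("bb", 2);
        map.put("ccc", 3);
        System.out.println(totalKeyLength(map)); // 6  (1 + 2 + 3)
        System.out.println(weightedTotal(map));  // 14 (1*1 + 2*2 + 3*3)
        System.out.println(totalValue(map));     // 6  (1 + 2 + 3)
    }
}
```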

#5


2  

In general, the second one will be a bit faster for a HashMap. It only really matters if you have lots of hash collisions, because then the get(key) call becomes slower than O(1): it becomes O(k), with k being the number of entries in the same bucket (i.e. the number of keys with the same hash code, or with a different hash code that still maps to the same bucket; this also depends on the capacity, size and load factor of the map).

The entry-iterating variant does not have to do the lookup at all, so it is a bit faster here.
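
The effect of collisions on get() can be made visible by counting equals() calls instead of measuring time. The sketch below assumes a hypothetical Key class whose hashCode() is either well distributed or deliberately constant; CollisionCost and all other names are made up for illustration. The exact counts depend on the JDK version (newer versions store large buckets as trees), so only the comparison between the two cases is meaningful.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; CollisionCost and Key are hypothetical names.
final class CollisionCost {

    // Incremented on every Key.equals() call.
    static long equalsCalls = 0;

    static final class Key {
        final int id;
        final boolean collide;

        Key(int id, boolean collide) {
            this.id = id;
            this.collide = collide;
        }

        // A constant hash code forces every key into the same bucket.
        @Override
        public int hashCode() {
            return collide ? 42 : id;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof Key && ((Key) other).id == id;
        }
    }

    // Fills a map with n keys, then counts equals() calls made by n get() lookups.
    static long equalsCallsForLookups(int n, boolean collide) {
        Map<Key, Integer> map = new HashMap<Key, Integer>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, collide), i);
        }
        equalsCalls = 0; // ignore comparisons made while populating
        for (int i = 0; i < n; i++) {
            // A fresh, equal key instance, so the map cannot shortcut via ==.
            if (map.get(new Key(i, collide)) == null) {
                throw new IllegalStateException("key not found: " + i);
            }
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("distinct hashes:  " + equalsCallsForLookups(1000, false) + " equals() calls");
        System.out.println("colliding hashes: " + equalsCallsForLookups(1000, true) + " equals() calls");
    }
}
```

With colliding keys each get(key) has to compare against several entries in the bucket, which is exactly the work the entrySet() loop avoids.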

Another note: If the capacity of your map is a lot bigger than its actual size and you iterate a lot, you might consider using LinkedHashMap instead. It provides O(size) instead of O(size + capacity) complexity for a complete iteration (as well as a predictable iteration order). (You should still measure whether this really gives an improvement, since the constant factors may vary. LinkedHashMap has a bigger overhead for building the map.)
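
A rough way to check the capacity effect on your own machine is sketched below: both maps get the same large initial capacity but only a handful of entries. The class name SparseMapIteration, the helper names, and the chosen capacity and size are assumptions for illustration, and the printed timings are single unwarmed measurements, so treat them as indicative only.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; all names and sizes here are made up for the example.
final class SparseMapIteration {

    // Puts the entries 0..size-1 into the map.
    static void fill(Map<Integer, Integer> map, int size) {
        for (int i = 0; i < size; i++) {
            map.put(i, i);
        }
    }

    // Full iteration over the entry set.
    static long sumValues(Map<Integer, Integer> map) {
        long sum = 0;
        for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
            sum += entry.getValue();
        }
        return sum;
    }

    // Times one full iteration in nanoseconds.
    static long timeNanos(Map<Integer, Integer> map) {
        long start = System.nanoTime();
        long sum = sumValues(map);
        long elapsed = System.nanoTime() - start;
        if (sum < 0) {
            throw new IllegalStateException("unexpected negative sum"); // keeps sum in use
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int capacity = 1 << 20; // far larger than the number of entries
        int size = 100;

        Map<Integer, Integer> sparseHash = new HashMap<Integer, Integer>(capacity);
        Map<Integer, Integer> sparseLinked = new LinkedHashMap<Integer, Integer>(capacity);
        fill(sparseHash, size);
        fill(sparseLinked, size);

        System.out.println("HashMap iteration:       " + timeNanos(sparseHash) + " ns");
        System.out.println("LinkedHashMap iteration: " + timeNanos(sparseLinked) + " ns");
    }
}
```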

#6


2  

bguiz,

I think (I don't know) that iterating the entrySet (alternative 2) is marginally more efficient, simply because it doesn't hash each key in order to get its value... Having said that, calculating the hash is an O(1) operation per entry, so we're ONLY talking O(n) over the whole HashMap... but note that all this applies to HashMap only... other implementations of Map may have VERY different performance characteristics.

I do think you'd be "pushing it" to actually NOTICE the difference in performance. If you are concerned, then why not set up a test case to time both iteration techniques?

If you don't have a REAL, reported performance issue, then you're really worrying about not very much... A few clock ticks here and there won't affect the overall usability of your program.

I believe that many, many other aspects of the code are typically more important than outright performance. Of course some blocks are "performance critical", and this is known BEFORE the code is even written, let alone performance tested... but such cases are fairly rare. As a general approach it's better to focus on writing complete, correct, flexible, testable, reusable, readable, maintainable code... performance CAN be built in later, as the need arises.

Version 0 should be AS SIMPLE AS POSSIBLE, without any "optimizations".

