Understanding the time complexity calculation of Dijkstra's algorithm

As per my understanding, I have calculated the time complexity of Dijkstra's algorithm, using the adjacency-list representation, in big-O notation as given below. It didn't come out as expected, which led me to work through it step by step.

  1. Each vertex can be connected to (V-1) vertices, hence the number of adjacent edges per vertex is V-1. Say E represents the V-1 edges connected to each vertex.
  2. Finding and updating each adjacent vertex's weight in the min-heap is O(log(V)) + O(1), i.e. O(log(V)).
  3. Hence, from steps 1 and 2 above, the time complexity for updating all adjacent vertices of one vertex is E * log(V).
  4. Hence, the time complexity for all V vertices is V * (E * log(V)), i.e. O(VElogV).

But the time complexity of Dijkstra's algorithm is O(ElogV). Why?


Dijkstra's shortest path algorithm is O(ElogV) where:

  • V is the number of vertices
  • E is the total number of edges

Your analysis is correct, but your symbols have different meanings! You say the algorithm is O(VElogV) where:

  • V is the number of vertices
  • E is the maximum number of edges attached to a single node.

Let's rename your E to N. So one analysis says O(ElogV) and the other says O(VNlogV). Both are correct, and in fact E = O(VN); the difference is that ElogV is the tighter estimate. For example, in a star graph N = V - 1 (the center's degree) while E = V - 1 as well, so O(ElogV) = O(VlogV), whereas O(VNlogV) = O(V^2 logV).

Let n be the number of vertices and m be the number of edges.

Since with Dijkstra's algorithm you have O(n) delete-mins and O(m) decrease_keys, each costing O(log n), the total run time using binary heaps will be O((m + n) log n).

It is entirely possible to amortize the cost of decrease_key down to O(1) using Fibonacci heaps, resulting in a total run time of O(n log n + m), but in practice this is often not done, since the constant-factor penalties of Fibonacci heaps are pretty big, and on random graphs the number of decrease_keys is way below its upper bound (more in the range of O(n * log(m/n)), which is much better on sparse graphs where m = O(n)). So always be aware of the fact that the total run time depends both on your data structures and on the input class.
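As a concrete illustration of the "skip decrease_key" strategy this answer alludes to, here is a minimal Python sketch (function and variable names are my own, not from any particular library) of the binary-heap variant with lazy deletion: instead of decreasing a key, we push a fresh entry and discard stale entries when they are popped.

    import heapq

    def dijkstra(adj, source):
        """adj: {u: [(v, weight), ...]}; returns shortest distances from source.
        O(n) pops that do real work + O(m) pushes, each O(log n) -> O((n + m) log n)."""
        dist = {source: 0}
        heap = [(0, source)]            # may hold up to O(m) entries at once
        done = set()
        while heap:
            d, u = heapq.heappop(heap)  # delete-min
            if u in done:
                continue                # stale entry left behind by a "decrease"
            done.add(u)
            for v, w in adj.get(u, ()):
                nd = d + w
                if v not in done and nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))  # re-push instead of decrease_key
        return dist

    print(dijkstra({0: [(1, 4), (2, 1)], 2: [(1, 2)], 1: []}, 0))  # {0: 0, 2: 1, 1: 3}

Because the heap can hold up to m entries, each operation costs O(log m) <= O(log n^2) = O(2 log n) = O(log n), so the bound is still O((n + m) log n).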

In a dense (or complete) graph, E logV > V^2, so using adjacency lists and a binary heap is not always best.

In that case, I prefer to use a plain adjacency matrix and find the minimum tentative distance by scanning a row: just V^2 time is needed (sketched below).

If instead the number of edges per vertex is below V / logV (equivalently, E < V^2 / logV), or the maximum number of edges per vertex is bounded by some constant K, then use the binary heap.
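A minimal sketch of that O(V^2) matrix variant in Python (my own illustration of the idea described above, with hypothetical names):

    def dijkstra_matrix(w, source):
        """w: V x V matrix of edge weights, float('inf') where there is no edge.
        Runs in O(V^2): V rounds, each doing two O(V) scans."""
        INF = float("inf")
        V = len(w)
        dist = [INF] * V
        dist[source] = 0
        visited = [False] * V
        for _ in range(V):
            # O(V) scan for the closest unvisited vertex replaces the heap's delete-min
            u, best = -1, INF
            for i in range(V):
                if not visited[i] and dist[i] < best:
                    u, best = i, dist[i]
            if u == -1:
                break                      # remaining vertices are unreachable
            visited[u] = True
            for v in range(V):             # O(V) relaxation over u's row
                if not visited[v] and best + w[u][v] < dist[v]:
                    dist[v] = best + w[u][v]
        return dist

    INF = float("inf")
    w = [[INF, 4, 1],
         [4, INF, 2],
         [1, 2, INF]]
    print(dijkstra_matrix(w, 0))  # [0, 3, 1]

For a complete graph this is O(V^2) in total, versus O(E logV) = O(V^2 logV) for the binary-heap version, which is exactly the trade-off described above.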

Let's try to analyze the algorithm as given in the CLRS book.

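For readers without the original screenshot: the figure shows the DIJKSTRA pseudocode from CLRS (3rd ed.), reproduced from memory below; the line numbers are the ones referenced in the analysis that follows.

    DIJKSTRA(G, w, s)
    1  INITIALIZE-SINGLE-SOURCE(G, s)
    2  S = Ø
    3  Q = G.V
    4  while Q ≠ Ø
    5      u = EXTRACT-MIN(Q)
    6      S = S ∪ {u}
    7      for each vertex v ∈ G.Adj[u]
    8          RELAX(u, v, w)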

The for-each loop in line 7: for any vertex, say 'u', the number of times the loop runs equals the number of vertices adjacent to 'u'. The number of adjacent vertices of a node is always less than or equal to the total number of edges in the graph.

If we take V (because of the while loop in line 4) and E (because of the for-each loop in line 7) and compute the complexity as V*E*log(V), that would be equivalent to assuming every vertex has E edges incident on it; in reality, a single vertex has at most E incident edges. (The extreme case of E adjacent vertices for a single vertex occurs at the internal vertex of a star graph.)

Adding a more detailed explanation as I understood it just in case:

  • O(for each vertex using min heap: for each edge linearly: push the vertices that the edge points to onto the min heap)
  • V = number of vertices
  • O(V * (pop vertex from min heap + find unvisited vertices in edges * push them to min heap))
  • E = number of edges on each vertex
  • O(V * (pop vertex from min heap + E * push unvisited vertices to min heap)). Note, that we can push the same node multiple times here before we get to "visit" it.
  • O(V * (log(heap size) + E * log(heap size)))
  • O(V * ((E + 1) * log(heap size)))
  • O(V * (E * log(heap size)))
  • E = V in the worst case, because each vertex can reference all other vertices
  • O(V * (V * log(heap size)))
  • O(V^2 * log(heap size))
  • heap size is V^2, because we push every time we want to update a distance, and each vertex's distance can be improved up to V times. E.g. the last vertex may first be reached with distance 10, then 9, then 8, and so on; we push on each update, so the same vertex can sit in the heap several times (see the experiment after this list)
  • O(V^2 * log(V^2))
  • O(V^2 * 2 * log(V))
  • O(V^2 * log(V))
  • V^2 is also the total number of edges in the worst case, so if we let E = V^2 (as in the conventional naming), we get O(E * log(V))
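To check the claim above that the heap really can grow far past V, here is a small experiment of my own (a hedged sketch, not part of the original answer): run the push-based variant on a complete graph with random weights and record the total number of pushes and the peak heap size.

    import heapq, random

    def count_heap_ops(n, seed=0):
        """Push-based Dijkstra on the complete graph K_n with random weights.
        Returns (total_pushes, peak_heap_size)."""
        rng = random.Random(seed)
        w = [[0.0] * n for _ in range(n)]
        for u in range(n):
            for v in range(u + 1, n):
                w[u][v] = w[v][u] = rng.random()
        dist = [float("inf")] * n
        dist[0] = 0.0
        heap = [(0.0, 0)]
        visited = [False] * n
        pushes = peak = 1
        while heap:
            d, u = heapq.heappop(heap)
            if visited[u]:
                continue                  # stale entry: u was pushed more than once
            visited[u] = True
            for v in range(n):
                if v != u and not visited[v] and d + w[u][v] < dist[v]:
                    dist[v] = d + w[u][v]
                    heapq.heappush(heap, (dist[v], v))
                    pushes += 1
                    peak = max(peak, len(heap))
        return pushes, peak

    print(count_heap_ops(200))  # on typical runs, both numbers exceed V = 200

Every push corresponds to one successful distance update, so pushes <= E, and the heap never holds more than E = O(V^2) entries, which is where the log(V^2) = 2 * log(V) = O(log V) factor comes from.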

V: number of vertices, E: total number of edges. Suppose the graph is dense. The complexity would be O(V*logV) + O((1 + 2 + ... + (V-1)) * logV).

1 + 2 + ... + (V-1) = V*(V-1)/2 ~ V^2 ~ E, because the graph is dense. So the complexity would be O(ElogV).

The sum 1 + 2 + ... + (V-1) refers to the loop "for each vertex v in G.Adj[u] but not in S": if you think about Q, before a vertex is extracted it has V vertices, then V-1, then V-2, ..., then 1, so the scan over neighbors not yet in S touches at most V-1, then V-2, ..., then 0 vertices.


I find it easier to think at this complexity in the following way:

  1. The nodes are first inserted into a priority queue and then extracted one by one, leading to O(V log V).
  2. Once a node is extracted, we iterate through its edges and update the priority queue accordingly. Note that every edge is explored only once, moreover, updating the priority queue is O(log V), leading to an overall O(E log V).

TLDR. You have V extractions from the priority queue and E updates to the priority queue, leading to an overall O((V + E) log V).
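If you want those E updates to be literal decrease-key operations rather than re-pushes, the heap has to track each vertex's position. Below is a sketch of that eager variant (entirely my own construction, for illustration); adj is assumed to contain every vertex as a key.

    class MinHeap:
        """Binary min-heap of (priority, vertex) pairs with O(log n) decrease_key."""
        def __init__(self):
            self.h = []    # heap array
            self.pos = {}  # vertex -> index in self.h

        def _swap(self, i, j):
            self.h[i], self.h[j] = self.h[j], self.h[i]
            self.pos[self.h[i][1]] = i
            self.pos[self.h[j][1]] = j

        def _sift_up(self, i):
            while i > 0 and self.h[i][0] < self.h[(i - 1) // 2][0]:
                self._swap(i, (i - 1) // 2)
                i = (i - 1) // 2

        def _sift_down(self, i):
            n = len(self.h)
            while True:
                smallest = i
                for c in (2 * i + 1, 2 * i + 2):
                    if c < n and self.h[c][0] < self.h[smallest][0]:
                        smallest = c
                if smallest == i:
                    return
                self._swap(i, smallest)
                i = smallest

        def push(self, prio, v):
            self.h.append((prio, v))
            self.pos[v] = len(self.h) - 1
            self._sift_up(len(self.h) - 1)

        def pop_min(self):               # one of the V extractions, O(log V)
            self._swap(0, len(self.h) - 1)
            prio, v = self.h.pop()
            del self.pos[v]
            if self.h:
                self._sift_down(0)
            return prio, v

        def decrease_key(self, v, prio): # one of the <= E updates, O(log V)
            i = self.pos[v]
            self.h[i] = (prio, v)
            self._sift_up(i)

    def dijkstra_eager(adj, source):
        """adj: {u: [(v, w), ...]}; V pop_mins + at most E decrease_keys."""
        dist = {u: float("inf") for u in adj}
        dist[source] = 0
        heap = MinHeap()
        for u in adj:
            heap.push(dist[u], u)
        while heap.h:
            d, u = heap.pop_min()
            for v, w in adj[u]:
                if v in heap.pos and d + w < dist[v]:
                    dist[v] = d + w
                    heap.decrease_key(v, dist[v])
        return dist

    print(dijkstra_eager({0: [(1, 4), (2, 1)], 1: [], 2: [(1, 2)]}, 0))  # {0: 0, 1: 3, 2: 1}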

  • E is the number of edges and V is the number of vertices. The number of edges is at most

        (V * (V - 1)) / 2

    which is approximately

        V^2

  • So we can push at most V^2 entries onto the min heap, and restoring the heap property after an operation on a heap of that size takes

        O(Log(V^2))

  • Every time we insert a new element into the min heap, the heap re-orders itself. We have E edges, so there will be E insertions, giving a total time complexity of

        O(E * Log(V^2)) = O(2 * E * Log(V))

Omitting the constant 2:

    O(E * Log(V))