Sorting in computer science vs. sorting in the "real" world

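I was thinking about sorting algorithms in software, and possible ways one could surmount the O(n log n) roadblock. I don't think it IS possible to sort faster in a practical sense, so please don't think that I do.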

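With that said, it seems that with almost all sorting algorithms, the software must know the position of each element. That makes sense; otherwise, how would it know where to place each element according to some sorting criteria?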

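But when I crossed this thinking with the real world, a centrifuge has no idea what position each molecule is in when it "sorts" the molecules by density. In fact, it doesn't care about the position of each molecule. However, it can sort trillions upon trillions of items in a relatively short period of time, because each molecule follows the laws of density and gravity. That got me thinking.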

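Would it be possible with some overhead on each node (some value or method tacked on to each of the nodes) to "force" the order of the list? Something like a centrifuge, where only each element cares about its relative position in space (in relation to other nodes). Or does this violate some rule in computation?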

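I think one of the big points brought up here is the quantum mechanical effects of nature and how they apply in parallel to all particles simultaneously.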

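Perhaps classical computers inherently restrict sorting to the domain of O(n log n), whereas quantum computers may be able to cross that threshold into O(log n) algorithms that act in parallel.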

The point that a centrifuge is basically a parallel bubble sort seems to be correct, and that has a time complexity of O(n).

I guess the next thought is: if nature can sort in O(n), why can't computers?

The trick is that you only have a probability of having sorted your list using a centrifuge. As with other real-world sorts [citation needed], you can change the probability that you have sorted your list, but never be certain without checking all the values (atoms).

Consider the question: "How long should you run your centrifuge for?"
If you only ran it for a picosecond, your sample might be less sorted than its initial state. If you ran it for a few days, it might be completely sorted. However, you wouldn't know without actually checking the contents.

Would it be possible with some overhead on each node (some value or method tacked on to each of the nodes) to 'force' the order of the list?

When we sort using computer programs, we select a property of the values being sorted. That's commonly the magnitude of a number or alphabetical order.

Something like a centrifuge, where only each element cares about its relative position in space (in relation to other nodes)

This analogy reminds me of simple bubble sort, where smaller numbers bubble up in each iteration, much like your centrifuge logic.

So to answer this, don't we actually do something of that sort in software based sorting?

A real-world example of a computer-based "ordering" would be autonomous drones that work cooperatively with each other, known as "drone swarms". The drones act and communicate both as individuals and as a group, and can track multiple targets. The drones collectively decide which drones will follow which targets, while also handling the obvious need to avoid collisions with one another. The early versions of this were drones that moved through waypoints while staying in formation, but the formation could change.

For a "sort", the drones could be programmed to form a line or pattern in a specific order, initially released in any permutation or shape, and collectively and in parallel they would quickly form the ordered line or pattern.

Getting back to a computer based sort, one issue is that there's one main memory bus, and there's no way for a large number of objects to move about in memory in parallel.

know the position of each element

In the case of a tape sort, the position of each element (record) is only "known" to the "tape", not to the computer. A tape based sort only needs to work with two elements at a time, and a way to denote run boundaries on a tape (file mark, or a record of different size).
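As a rough illustration of "only two elements at a time", here is a minimal sketch of merging two sorted runs while holding just the current record from each tape. The name `merge_runs` and the `_END` sentinel (standing in for a file mark) are made up for this example; a real tape sort would also have to distribute runs across tapes, which is left out here.

```python
from typing import Iterable, Iterator


def merge_runs(tape_a: Iterable[int], tape_b: Iterable[int]) -> Iterator[int]:
    """Merge two sorted runs, holding only one record from each tape at a time."""
    it_a, it_b = iter(tape_a), iter(tape_b)
    _END = object()  # hypothetical sentinel standing in for a file mark
    a, b = next(it_a, _END), next(it_b, _END)
    while a is not _END and b is not _END:
        # Only the two current records are ever compared.
        if a <= b:
            yield a
            a = next(it_a, _END)
        else:
            yield b
            b = next(it_b, _END)
    # One tape reached its end mark; copy the rest of the other tape.
    while a is not _END:
        yield a
        a = next(it_a, _END)
    while b is not _END:
        yield b
        b = next(it_b, _END)
```

The function never indexes into either input, so it has no idea where a record sits on its "tape"; it only sees whatever record is under the head.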

Computational complexity is always defined with respect to some computational model. For example, an algorithm that's O(n) on a typical computer might be O(2^n) if implemented in Brainfuck.

The centrifuge computational model has some interesting properties; for example:

  • it supports arbitrary parallelism; no matter how many particles are in the solution, they can all be sorted simultaneously.
  • it doesn't give a strict linear sort of particles by mass, but rather a very close (low-energy) approximation.
  • it's not feasible to examine the individual particles in the result.
  • it's not possible to sort particles by different properties; only mass is supported.

Given that we don't have the ability to implement something like this in general-purpose computing hardware, the model may not have practical relevance; but it can still be worth examining, to see if there's anything to be learned from it. Nondeterministic algorithms and quantum algorithms have both been active areas of research, for example, even though neither is actually implementable today.

EDIT: I had misunderstood the mechanism of a centrifuge and it appears that it does a comparison, a massively-parallel one at that. However there are physical processes that operate on a property of the entity being sorted rather than comparing two properties. This answer covers algorithms that are of that nature.

A centrifuge applies a sorting mechanism that doesn't really work by means of comparisons between elements, but by a property ('centrifugal force') acting on each individual element in isolation. Some sorting algorithms fall into this theme, especially Radix Sort. When this sorting algorithm is parallelized, it should approach the example of a centrifuge.

Some other non-comparative sorting algorithms are Bucket sort and Counting Sort. You may find that Bucket sort also fits into the general idea of a centrifuge (the radius could correspond to a bin).
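To make the non-comparative idea concrete, here is a minimal sketch of a least-significant-digit radix sort, assuming non-negative integers. The function name and the `base` parameter are just illustrative. Each element is dropped into a bucket based only on its own digit, a bit like a particle settling at a radius determined only by its own mass.

```python
def radix_sort(values, base=10):
    """LSD radix sort for non-negative integers.

    Placement depends only on each element's own digits; two elements
    are never compared with each other while they are being placed.
    """
    out = list(values)
    if not out:
        return out
    largest = max(out)  # used only to decide how many digit passes are needed
    place = 1
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for v in out:
            buckets[(v // place) % base].append(v)
        # Collect buckets in order; this keeps earlier passes stable.
        out = [v for bucket in buckets for v in bucket]
        place *= base
    return out
```

The inner loop over `out` is the part that could in principle run in parallel, since each element's bucket is independent of every other element.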

Another so-called 'sorting algorithm' where each element is considered in isolation is the Sleep Sort. Here time rather than the centrifugal force acts as the magnitude used for sorting.
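For the curious, a toy sketch of Sleep Sort using threads. The `unit` parameter (seconds per unit of value) is an assumption of this example. The result depends on the OS scheduler, so it is not guaranteed to be correct, especially for values that are close together or for a very small `unit`.

```python
import threading
import time


def sleep_sort(values, unit=0.05):
    """Each value sleeps for a time proportional to itself, then reports in."""
    result = []
    lock = threading.Lock()

    def worker(v):
        time.sleep(v * unit)  # time plays the role of the centrifugal force
        with lock:
            result.append(v)

    threads = [threading.Thread(target=worker, args=(v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Note that the wall-clock time is proportional to the largest value rather than to the number of elements, and the scheduler is quietly doing the ordering work for you.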

IMHO, people overthink log(n). O(nlog(n)) IS practically O(n). And you need O(n) just to read the data.
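A quick back-of-the-envelope check of how small the log factor stays, as a sketch (the helper name `log_factor` is just for illustration):

```python
import math


def log_factor(n):
    """Return log2(n), the extra factor that O(n log n) carries over O(n)."""
    return math.log2(n)


for n in (10**3, 10**6, 10**9):
    # Even for a billion elements the factor stays below 30.
    print(n, round(log_factor(n), 1))
```

So going from a thousand to a billion elements multiplies n by a million, while the log factor only roughly triples.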

Many algorithms such as quicksort do provide a very fast way to sort elements. You could implement variations of quicksort that would be very fast in practice.

Inherently all physical systems are infinitely parallel. You might have a buttload of atoms in a grain of sand, yet nature has enough computational power to figure out where each electron in each atom should be. So if you had enough computational resources (O(n) processors) you could sort n numbers in log(n) time.

From comments:

  1. Given a physical processor that has k elements, it can achieve parallelism of at most O(k). If you process n numbers arbitrarily, it would still process them at a rate related to k. Also, you could formulate this problem physically. You could create n steel balls with weights proportional to the numbers you want to encode, which could in theory be sorted by a centrifuge. But here the number of atoms you are using is proportional to n, whereas in the standard case you have a limited number of atoms in a processor.

  2. Another way to think about this is, say you have a small processor attached to each number and each processor can communicate with its neighbors, you could sort all those numbers in O(log(n)) time.

The centrifuge is not sorting the nodes; it applies a force to them, and they react to it in parallel. So if you were to implement a bubble sort where each node moves itself up or down in parallel based on its "density", you'd have a centrifuge implementation.
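One way to picture such a parallel bubble sort is odd-even transposition sort, which I'm swapping in here as a stand-in for the idea. The sketch below simulates it sequentially. Within a round, the compared pairs are disjoint, so with one processor per pair they could all act at the same time, and n rounds are enough to sort n elements.

```python
def odd_even_transposition_sort(values):
    """Sequential simulation of a parallel bubble sort.

    Each round compares disjoint neighbour pairs, so on real parallel
    hardware a whole round could happen in a single time step.
    """
    a = list(values)
    n = len(a)
    for round_no in range(n):
        # Even rounds use pairs (0,1), (2,3), ...; odd rounds use (1,2), (3,4), ...
        for i in range(round_no % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

With about n/2 processors this gives O(n) parallel time, but the total number of comparisons is still on the order of n², so the work has not gone anywhere; it has just been spread out.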

Keep in mind that in the real world you can run a very large number of tasks in parallel, whereas in a computer the maximum number of truly parallel tasks equals the number of physical processing units.

In the end, you would also be limited by access to the list of elements, because it cannot be modified simultaneously by two nodes...

The sort still takes O(n) total work. It only finishes faster than that because of parallelization.

You could view a centrifuge as a bucket sort of n atoms, parallelized over n cores (each atom acts as a processor).

You can make sorting faster by parallelization, but only by a constant factor, because the number of processors is limited: O(n/C) is still O(n). (CPUs usually have < 10 cores and GPUs < 6000.)

I worked in an office during the summers after high school, when I started college. In AP Computer Science I had studied, among other things, sorting and searching.

I applied this knowledge in several physical systems that I can recall:

Natural merge sort to start…

A system printed multipart forms including a file-card-sized tear off, which needed to be filed in a bank of drawers.

I started with a pile of them and sorted the pile to begin with. The first step is picking up 5 or so, few enough to be easily placed in order in your hand. Place the sorted packet down, criss-crossing each stack to keep them separate.

Then, merge each pair of stacks, producing a larger stack. Repeat until there is only one stack.
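In code, the two steps above might look roughly like the sketch below. The name `packet_merge_sort` and the `packet_size` parameter are made up for this example, and `cards` is assumed to be a list of comparable keys. It uses fixed-size packets, as in the hand procedure, rather than detecting existing runs.

```python
from heapq import merge


def packet_merge_sort(cards, packet_size=5):
    """Sort small packets 'in the hand', then merge stacks pairwise."""
    # Step 1: pick up a few cards at a time and order each packet.
    stacks = [sorted(cards[i:i + packet_size])
              for i in range(0, len(cards), packet_size)]
    # Step 2: merge each pair of stacks until only one stack is left.
    while len(stacks) > 1:
        merged = []
        for i in range(0, len(stacks), 2):
            pair = stacks[i:i + 2]  # the last stack may have no partner
            merged.append(list(merge(*pair)))
        stacks = merged
    return stacks[0] if stacks else []
```

Each pass over the table halves the number of stacks, so there are about log2(n / packet_size) merge passes, and each pass handles every card once.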

…Insertion sort to complete

It is easier to file the sorted cards, as each next one is a little farther down the same open drawer.

Radix sort

Nobody else understood how I did this one so fast, despite my repeated attempts to teach it.

A large box of check stubs (the size of punch cards) needs to be sorted. It looks like playing solitaire on a large table—deal out, stack up, repeat.

In general

30 years ago, I did notice what you’re asking about: the ideas transfer to physical systems quite directly because there are relative costs of comparisons and handling records, and levels of caching.

Going beyond well-understood equivalents

I recall an essay about your topic, and it brought up the spaghetti sort. You trim a length of dried noodle to indicate the key value, and label it with the record ID. This is O(n), simply processing each item once.

Then you grab the bundle and tap one end on the table. They align on the bottom edges, and they are now sorted. You can trivially take off the longest one, and repeat. The read-out is also O(n).

There are two things going on here in the “real world” that don’t correspond to algorithms. First, aligning the edges is a parallel operation. Every data item is also a processor (the laws of physics apply to it). So, in general, you scale the available processing with n, essentially dividing your classic complexity by a factor of n.

Second, how does aligning the edges accomplish a sort? The real sorting is in the read-out which lets you find the longest in one step, even though you did compare all of them to find the longest. Again, divide by a factor of n, so finding the largest is now O(1).
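A sequential sketch of the read-out phase shows where that factor of n goes when the physics is taken away. The function name is just illustrative. In software, "take off the longest one" needs a scan over the whole bundle, so the read-out costs O(n) per rod and O(n²) overall, which makes it effectively a selection sort.

```python
def spaghetti_sort(lengths):
    """Read-out phase of spaghetti sort: repeatedly remove the longest rod."""
    bundle = list(lengths)
    longest_first = []
    while bundle:
        # A hand lowered onto the bundle finds the tallest rod in one step;
        # software has to scan every remaining rod to find it.
        tallest = max(range(len(bundle)), key=bundle.__getitem__)
        longest_first.append(bundle.pop(tallest))
    return longest_first
```

For example, `spaghetti_sort([3, 7, 1, 5])` yields the rods longest first, the same order in which you would pull them out of the bundle.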

Another example is using analog computing: a physical model solves the problem “instantly” and the prep work is O(n). In principle the computation is scaling with the number of interacting components, not the number of prepped items. So the computation scales with n². The example I'm thinking of is a weighted multi-factor computation, which was done by drilling holes in a map, hanging weights from strings passing through the holes, and gathering all the strings on a ring.

Another perspective is that what you're describing with the centrifuge is analogous to what's been called the "spaghetti sort" (https://en.wikipedia.org/wiki/Spaghetti_sort). Say you have a box of uncooked spaghetti rods of varying lengths. Hold them in your fist, and loosen your hand to lower them vertically so the ends are all resting on a horizontal table. Boom! They're sorted by height. O(constant) time. (Or O(n) if you include picking the rods out by height and putting them in a . . . spaghetti rack, I guess?)

You can note there that it's O(constant) in the number of pieces of spaghetti, but, due to the finite speed of sound in spaghetti, it's O(n) in the length of the longest strand. So nothing comes for free.

First of all, you are comparing two different contexts: one is logic (the computer) and the other is physics. So far we have shown that some parts of physics can be modeled with mathematical formulas, and we as programmers can use these formulas to simulate (some parts of) physics in the logic world (e.g. a physics engine in a game engine).

Second, we have some possibilities in the computer (logic) world that are nearly impossible in physics. For example, we can access memory and find the exact location of each entity at any time, but in physics that is a huge problem (Heisenberg's uncertainty principle).

Third, if you want to map a centrifuge and its operation in the real world to the computer world, it is as if someone (God) has given you a supercomputer with all the rules of physics applied, and you are doing your small sort in it (using the centrifuge). By saying that your sorting problem was solved in O(n), you are ignoring the huge physics simulation going on in the background...

Consider: is "centrifuge sort" really scaling better? Think about what happens as you scale up.

  • The test tubes have to get longer and longer.
  • The heavy stuff has to travel further and further to get to the bottom.
  • The moment of inertia increases, requiring more power and longer times to accelerate up to sorting speed.

It's also worth considering other problems with centrifuge sort. For example, you can only operate on a narrow size scale. A computer sorting algorithm can handle integers from 1 to 2^1024 and beyond, no sweat. Put something that weighs 2^1024 times as much as a hydrogen atom into a centrifuge and, well, that's a black hole and the galaxy has been destroyed. The algorithm failed.

Of course, the real answer here is that computational complexity is relative to some computational model, as mentioned in another answer. And "centrifuge sort" doesn't make sense in the context of common computational models, such as the RAM model, the I/O model, or multitape Turing machines.