# sorting algorithm

46 results

pages: 120 words: 17,504

Data Structures & Algorithms Interview Questions You'll Most Likely Be Asked by Vibrant Publishers

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

Provide its recursive description: Sorting Algorithms 176: What are the categories of sorting algorithms? 177: Describe in short an insertion sorting algorithm. 178: What advantages does insertion sort provide? 179: What search algorithm does the straight insertion sorting algorithm use? 180: What is the difference between insertion and exchange sorting? 181: Give examples of exchange sorting algorithms: 182: Briefly describe the quicksort algorithm. 183: Describe several variations of the quicksort algorithm. 184: What is the difference between selection and insertion sorting? 185: What is merge sorting? 186: What are the main steps of a merge sorting algorithm? 187: What is distribution sorting? Give examples of several distribution sorting algorithms. 188: Describe the adaptive heap sort algorithm. 189: Describe the counting sort algorithm. 190: Describe the Burstsort algorithm.

Answer: It is a sorting algorithm which has the unique characteristic that it does not make use of comparisons to do the sorting. Instead, distribution sorting algorithms rely on a priori knowledge about the universal set from which the elements to be sorted are drawn. The most well-known distribution sorting algorithms are bucket sort and radix sort. 188: Describe the adaptive heap sort algorithm. Answer: It is a variant of heapsort that uses a randomized binary search tree to structure the input according to any preexisting order. The randomized binary search tree is used to select candidates that are put into the heap so that the heap doesn't need to keep track of all elements. 189: Describe the counting sort algorithm. Answer: It is a 2-pass sort algorithm that is efficient when the range of keys is small and there are many duplicate keys.
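The two passes of counting sort described in answer 189 can be sketched as follows. This is a minimal illustration, not the book's code: the function name `counting_sort` and the `max_key` parameter are assumptions, and keys are assumed to be non-negative integers no larger than `max_key`.

```python
def counting_sort(keys, max_key):
    """Sort non-negative integer keys in the range 0..max_key (inclusive)."""
    # Pass 1: count how many times each key value occurs.
    counts = [0] * (max_key + 1)
    for k in keys:
        counts[k] += 1
    # Pass 2: walk the counts in key order and emit each key
    # as many times as it was seen.
    result = []
    for k, c in enumerate(counts):
        result.extend([k] * c)
    return result


print(counting_sort([3, 1, 3, 0, 1, 3], max_key=3))  # [0, 1, 1, 3, 3, 3]
```

No two keys are ever compared with each other, which is why the small key range matters: the `counts` table grows with `max_key`, not with the number of elements.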

O(n²) algorithms such as selection sort or bubble sort; the best case (nearly sorted input) is O(n) e) stable - does not change the relative order of elements with equal keys f) in-place - only requires a constant amount O(1) of additional memory space g) online - can sort a list as it receives it 179: What search algorithm does the straight insertion sorting algorithm use? Answer: Straight insertion sorting uses a linear search algorithm to locate the position at which the next element is to be inserted. 180: What is the difference between insertion and exchange sorting? Answer: In insertion sorting algorithms, insertion is performed into a sorted list. On the other hand, an exchange sort does not necessarily make use of such a sorted list. 181: Give examples of exchange sorting algorithms: Answer: The most well-known exchange sorting algorithms are the following: a) Bubble Sort - a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order b) Quicksort - a divide-and-conquer style algorithm which divides the original list into two sub-lists and recursively sorts each sub-list 182: Briefly describe the quicksort algorithm.
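The properties listed for insertion sort (stable, in-place, linear search for the insertion point) can be seen in a short sketch. The name `insertion_sort` is an assumption made for illustration; the excerpt itself gives no code.

```python
def insertion_sort(a):
    """Sort the list a in place by straight insertion and return it."""
    for i in range(1, len(a)):
        x = a[i]
        j = i
        # Linear search backwards through the sorted prefix a[0:i].
        # Using a strict '>' stops at the first element <= x, so equal
        # keys keep their original relative order (stability).
        while j > 0 and a[j - 1] > x:
            a[j] = a[j - 1]  # shift the larger element one slot right
            j -= 1
        a[j] = x
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

On input that is already sorted the inner loop never runs, which gives the O(n) best case mentioned above; only one extra variable (`x`) is needed beyond the list itself.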

Algorithms Unlocked by Thomas H. Cormen


The lower bound on comparison sorting Now that you have some idea about how the rules of the game may vary, let’s see a lower bound on how fast we can sort. We define a comparison sort as any sorting algorithm that determines the sorted order only by comparing pairs of elements. The four sorting algorithms from the previous chapter are comparison sorts, but REALLY-SIMPLE-SORT is not. Here’s the lower bound: In the worst case, any comparison sorting algorithm for n elements requires Ω(n lg n) comparisons between pairs of elements. Recall that Ω-notation gives a lower bound, so that what we’re saying is “for sufficiently large n, any comparison sorting algorithm requires at least cn lg n comparisons in the worst case, for some constant c.” Since each comparison takes at least constant time, that gives us an Ω(n lg n) lower bound on the time to sort n elements, assuming that we are using a comparison sorting algorithm.

I meant Ω(n) time, since it makes sense that we have to examine each element, even if we’re not comparing pairs of elements. The second important thing is truly remarkable: this lower bound does not depend on the particular algorithm, as long as it’s a comparison sorting algorithm. The lower bound applies to every comparison sorting algorithm, no matter how simple or complex. The lower bound applies to comparison sorting algorithms that have already been invented or will be invented in the future. It even applies to comparison sorting algorithms that will never be discovered by mankind! Beating the lower bound with counting sort We’ve already seen how to beat the lower bound in a highly restricted setting: there are only two possible values for the sort keys, and each element consists of only a sort key, with no satellite data.

In our bookshelf example, the key is just the author’s name, rather than a combination based first on the author’s name and then the title in case of two works by the same author. How, then, do we get the array to be sorted in the first place? In this chapter, we’ll see four algorithms (selection sort, insertion sort, merge sort, and quicksort) to sort an array, applying each of these algorithms to our bookshelf example. Each sorting algorithm will have its advantages and its disadvantages, and at the end of the chapter we’ll review and compare these sorting algorithms. All of the sorting algorithms that we’ll see in this chapter take either Θ(n²) or Θ(n lg n) time in the worst case. Therefore, if you were going to perform only a few searches, you’d be better off just running linear search. But if you were going to search many times, you might be better off first sorting the array and then searching by running binary search.

pages: 752 words: 131,533

Python for Data Analysis by Wes McKinney


Suppose we wanted to sort some data identified by first and last names:

    In [182]: first_name = np.array(['Bob', 'Jane', 'Steve', 'Bill', 'Barbara'])
    In [183]: last_name = np.array(['Jones', 'Arnold', 'Arnold', 'Jones', 'Walters'])
    In [184]: sorter = np.lexsort((first_name, last_name))
    In [185]: zip(last_name[sorter], first_name[sorter])
    Out[185]: [('Arnold', 'Jane'), ('Arnold', 'Steve'), ('Jones', 'Bill'), ('Jones', 'Bob'), ('Walters', 'Barbara')]

lexsort can be a bit confusing the first time you use it because the order in which the keys are used to order the data starts with the last array passed. As you can see, last_name was used before first_name.

Note: pandas methods like Series’s and DataFrame’s sort_index methods and the Series order method are implemented with variants of these functions (which also must take into account missing values).

Alternate Sort Algorithms

A stable sorting algorithm preserves the relative position of equal elements. This can be especially important in indirect sorts where the relative ordering is meaningful:

    In [186]: values = np.array(['2:first', '2:second', '1:first', '1:second', '1:third'])
    In [187]: key = np.array([2, 2, 1, 1, 1])
    In [188]: indexer = key.argsort(kind='mergesort')
    In [189]: indexer
    Out[189]: array([2, 3, 4, 0, 1])
    In [190]: values.take(indexer)
    Out[190]: array(['1:first', '1:second', '1:third', '2:first', '2:second'], dtype='|S8')

The only stable sort available is mergesort, which has guaranteed O(n log n) performance (for complexity buffs), but its performance is on average worse than the default quicksort method.
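The "last key is the primary key" behaviour of lexsort follows directly from stability, and it can be reproduced without NumPy as a rough sketch. The helper name `lexsort_indices` is hypothetical; it relies only on the fact that Python's built-in `list.sort` is stable.

```python
first_name = ['Bob', 'Jane', 'Steve', 'Bill', 'Barbara']
last_name = ['Jones', 'Arnold', 'Arnold', 'Jones', 'Walters']


def lexsort_indices(keys):
    """Return indices that order the data by the LAST key first."""
    order = list(range(len(keys[0])))
    # Sort on the least significant key first. Because each sort is
    # stable, a later pass on a more significant key keeps the earlier
    # ordering among elements that tie on that key.
    for key in keys:
        order.sort(key=lambda i: key[i])
    return order


sorter = lexsort_indices((first_name, last_name))
print([(last_name[i], first_name[i]) for i in sorter])
```

With the arrays from the excerpt this yields the same ordering as `Out[185]`: the final pass is on `last_name`, so it dominates, and ties such as the two Arnolds stay in first-name order from the earlier pass.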

pages: 474 words: 91,222

Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library by Scott Meyers


That’s not unreasonable. If you ask for the 20 best Widgets and some Widgets are equally good, you’re in no position to complain as long as the 20 you get back are at least as good as the ones you didn’t. For a full sort, you have slightly more control. Some sorting algorithms are stable. In a stable sort, if two elements in a range have equivalent values, their relative positions are unchanged after sorting. Hence, if Widget A precedes Widget B in the (unsorted) widgets vector and both have the same quality rating, a stable sorting algorithm will guarantee that after the vector is sorted, Widget A still precedes Widget B. An algorithm that is not stable would not make this guarantee. partial_sort is not stable. Neither is nth_element. sort, too, fails to offer stability, but there is an algorithm, stable_sort, that does what its name suggests.

Item 39: Make predicates pure functions I hate to do this to you, but we have to start with a short vocabulary lesson. • A predicate is a function that returns bool (or something that can be implicitly converted to bool). Predicates are widely used in the STL. The comparison functions for the standard associative containers are predicates, and predicate functions are commonly passed as parameters to algorithms like find_if and the various sorting algorithms. (For an overview of the sorting algorithms, turn to Item 31.) • A pure function is a function whose return value depends only on its parameters. If f is a pure function and x and y are objects, the return value of f(x, y) can change only if the value of x or y changes. In C++, all data consulted by pure functions are either passed in as parameters or are constant for the life of the function. (Naturally, such constant data should be declared const.)

pages: 247 words: 43,430

Think Complexity by Allen B. Downey


We will see how they work in Hashtables. Example 3-2. Read the Wikipedia page on sorting algorithms at http://en.wikipedia.org/wiki/Sorting_algorithm, and answer the following questions: What is a “comparison sort”? What is the best worst-case order of growth for a comparison sort? What is the best worst-case order of growth for any sort algorithm? What is the order of growth of bubble sort, and why does Barack Obama think it is “the wrong way to go”? What is the order of growth of radix sort? What preconditions do we need to use it? What is a stable sort, and why might it matter in practice? What is the worst sorting algorithm (that has a name)? What sort algorithm does the C library use? What sort algorithm does Python use? Are these algorithms stable? (You might have to Google around to find these answers.)

The general solution to this problem is to specify a machine model and analyze the number of steps (or operations) an algorithm requires under a given model. Relative performance might depend on the details of the dataset. For example, some sorting algorithms run faster if the data are already partially sorted; other algorithms run slower in this case. A common way to avoid this problem is to analyze the worst case scenario. It is also sometimes useful to analyze average case performance, but it is usually harder, and sometimes it is not clear what set of cases to average. Relative performance also depends on the size of the problem. A sorting algorithm that is fast for small lists might be slow for long lists. The usual solution to this problem is to express runtime (or number of operations) as a function of problem size and to compare the functions asymptotically as the problem size increases.

* * * [4] But if you get a question like this in an interview, I think a better answer is, “The fastest way to sort a million integers is to use whatever sort function is provided by the language I’m using. Its performance is good enough for the vast majority of applications, but if it turned out that my application was too slow, I would use a profiler to see where the time was being spent. If it looked like a faster sort algorithm would have a significant effect on performance, then I would look around for a good implementation of radix sort.” Analysis of Basic Python Operations Most arithmetic operations are constant time; multiplication usually takes longer than addition and subtraction, and division takes even longer, but these runtimes don’t depend on the magnitude of the operands. Very large integers are an exception; in that case, the runtime increases linearly with the number of digits.

pages: 554 words: 108,035

Scala in Depth by Joshua Suereth


This can be further extended to encode significantly complex type dependent algorithms and type level programming. 7.4. Conditional execution using the type system There comes a time in an algorithm’s life when it needs to do something rather clever. This clever behavior encodes portions of the algorithm into the type system so that it can execute at compile time. An example of this could be a sort algorithm. The sort algorithm can be written against the raw Iterator interface. But if I call sort against a vector, then I’d like to be able to utilize vector’s natural array separation in my sorting algorithm. Traditionally this has been solved with two mechanisms: overloading and overriding. Using overloading, the sort method is implemented in terms of Iterable and another is implemented in terms of Vector. The downside to overloading is that it prevents you from using named/default parameters, and it can suffer at compile time due to type erasure.

But when developing new algorithms outside the collections library, type classes can come to the rescue. 8.5.1. Optimizing algorithms for each collections type You can use the type class paradigm to encode an algorithm against collections and refine that algorithm when speed improvements are possible. Let’s start by converting the generic sort algorithm from before into a type class paradigm. First we’ll define the type class for the sort algorithm. trait Sortable[A] { def sort(a : A) : A } The Sortable type class is defined against the type parameter A. The type parameter A is meant to be the full type of a collection. For example, sorting a list of integers would require a Sortable[List[Int]] object. The sort method takes a value of type A and returns a sorted version of type A. The generic sort method can now be modified to look as follows: object Sorter { def sort[Col](col : Col)(implicit s : Sortable[Col]) = s.sort(col) } The Sorter object defines a single method sort.

This syntax uses the _ keyword as a placeholder for a function argument. If more than one placeholder is used, each consecutive placeholder refers to consecutive arguments to the function literal. This notation is usually reserved for simple functions, such as the less-than (<) comparison in our Quicksort. We can apply this notation paired with operator notation to achieve the following on our quick sort algorithm: Scala offers syntactic shortcuts for simple cases, and it provides a mechanism to bend the type system via implicit conversions and implicit arguments. 1.2.4. Implicits are an old concept Scala implicits are a new take on an old concept. The first time I was ever introduced to the concept of implicit conversions was with primitive types in C++. C++ allows primitive types to be automatically converted as long as there is no loss of precision.

pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths


See also DeDeo, Krakauer, and Flack, “Evidence of Strategic Periodicities in Collective Conflict Dynamics”; Daniels, Krakauer, and Flack, “Sparse Code of Conflict in a Primate Society”; Brush, Krakauer, and Flack, “A Family of Algorithms for Computing Consensus About Node State from Network Data.” For a broader overview of Flack’s work, see Flack, “Life’s Information Hierarchy.” This sporting contest is the marathon: The marathon has an analogue in the world of sorting algorithms. One of the more intriguing (Wikipedia used the word “esoteric” before the article was removed entirely) developments in beyond-comparison sorting theory arose from one of the most unlikely places: the notorious Internet message board 4chan. In early 2011, an anonymous post there proclaimed: “Man, am I a genius. Check out this sorting algorithm I just invented.” The poster’s “sorting algorithm”—Sleep Sort—creates a processing thread for each unsorted item, telling each thread to “sleep” the number of seconds of its value, and then “wake up” and output itself. The final output should, indeed, be sorted.

Each pass through the cards doubles the size of the sorted stacks, so to completely sort n cards you’ll need to make as many passes as it takes for the number 2, multiplied by itself, to equal n: the base-two logarithm, in other words. You can sort up to four cards in two collation passes, up to eight cards with a third pass, and up to sixteen cards with a fourth. Mergesort’s divide-and-conquer approach inspired a host of other linearithmic sorting algorithms that quickly followed on its heels. And to say that linearithmic complexity is an improvement on quadratic complexity is a titanic understatement. In the case of sorting, say, a census-level number of items, it’s the difference between making twenty-nine passes through your data set … and three hundred million. No wonder it’s the method of choice for large-scale industrial sorting problems.
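The doubling described above can be counted directly with a bottom-up merge sort. This is a sketch under assumptions: the names `merge` and `bottom_up_mergesort` are illustrative, and each card starts as its own sorted stack of size one.

```python
def merge(left, right):
    """Collate two sorted lists into one sorted list (stable)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            out.append(right[j])
            j += 1
        else:
            out.append(left[i])
            i += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def bottom_up_mergesort(items):
    """Return (sorted_list, number_of_collation_passes)."""
    runs = [[x] for x in items]  # every item is a sorted stack of one
    passes = 0
    while len(runs) > 1:
        # One pass: merge neighbouring stacks pairwise, which roughly
        # doubles the stack size and halves the number of stacks.
        runs = [merge(runs[k], runs[k + 1]) if k + 1 < len(runs) else runs[k]
                for k in range(0, len(runs), 2)]
        passes += 1
    return (runs[0] if runs else []), passes


for n in (4, 8, 16):
    print(n, bottom_up_mergesort(list(range(n, 0, -1)))[1])
```

The loop prints 2, 3, and 4 passes for 4, 8, and 16 items, matching the counts given in the passage.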

Jordan aims to get a group of 25 or so books onto his cart before putting them in final order, which he does using an Insertion Sort. And his carefully developed strategy is exactly the right way to get there: a Bucket Sort, with his well-informed forecast of how many books he’ll have with various call numbers telling him what his buckets should be. Sort Is Prophylaxis for Search Knowing all these sorting algorithms should come in handy next time you decide to alphabetize your bookshelf. Like President Obama, you’ll know not to use Bubble Sort. Instead, a good strategy—ratified by human and machine librarians alike—is to Bucket Sort until you get down to small enough piles that Insertion Sort is reasonable, or to have a Mergesort pizza party. But if you actually asked a computer scientist to help implement this process, the first question they would ask is whether you should be sorting at all.
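The "Bucket Sort until the piles are small, then Insertion Sort" strategy can be sketched as below. Everything specific here is an assumption for illustration: call numbers are modelled as non-negative integers, buckets are fixed-width ranges set by a hypothetical `bucket_width` parameter, and `bisect.insort` (which finds the slot by binary search) stands in for inserting a book into its place by hand.

```python
import bisect


def bucket_then_insertion_sort(call_numbers, bucket_width=100):
    """Sort integers by coarse bucketing, then insertion within each bucket."""
    # Phase 1 (Bucket Sort): drop each item into a pile by coarse range.
    buckets = {}
    for c in call_numbers:
        buckets.setdefault(c // bucket_width, []).append(c)
    # Phase 2 (Insertion Sort): order each small pile by inserting items
    # one at a time, then concatenate the piles in bucket order.
    result = []
    for b in sorted(buckets):
        pile = []
        for c in buckets[b]:
            bisect.insort(pile, c)
        result.extend(pile)
    return result


print(bucket_then_insertion_sort([305, 12, 299, 120, 7, 310]))
```

How well this works depends on the buckets matching the data: if most items fall into one range, that pile is no longer small and the insertion phase dominates.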

pages: 509 words: 92,141

The Pragmatic Programmer by Andrew Hunt, Dave Thomas


If a simple loop runs from 1 to n, then the algorithm is likely to be O(n): time increases linearly with n. Examples include exhaustive searches, finding the maximum value in an array, and generating checksums. Nested loops. If you nest a loop inside another, then your algorithm becomes O(m × n), where m and n are the two loops' limits. This commonly occurs in simple sorting algorithms, such as bubble sort, where the outer loop scans each element in the array in turn, and the inner loop works out where to place that element in the sorted result. Such sorting algorithms tend to be O(n²). Binary chop. If your algorithm halves the set of things it considers each time around the loop, then it is likely to be logarithmic, O(lg(n)) (see Exercise 37, page 183). A binary search of a sorted list, traversing a binary tree, and finding the first set bit in a machine word can all be O(lg(n)).
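The binary chop can be made concrete with a binary search that also counts its loop iterations. This is an illustrative sketch; the function name and the choice to return a step count are assumptions, not something taken from the book.

```python
def binary_search(sorted_items, target):
    """Return (index of target or -1, number of loop iterations)."""
    lo, hi = 0, len(sorted_items)
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        # Each iteration discards about half of the remaining range.
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return (lo if found else -1), steps


index, steps = binary_search(list(range(0, 2048, 2)), 1000)
print(index, steps)
```

For the 1,024-element list above the loop runs at most 11 times, roughly lg(n) + 1, where a linear scan could need up to 1,024 comparisons.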

The easiest way to do this is with assertions. In most C and C++ implementations, you'll find some form of assert or _assert macro that checks a Boolean condition. These macros can be invaluable. If a pointer passed in to your procedure should never be NULL, then check for it: void writeString(char *string) { assert(string != NULL); ... Assertions are also useful checks on an algorithm's operation. Maybe you've written a clever sort algorithm. Check that it works: for (int i = 0; i < num_entries-1; i++) { assert(sorted[i] <= sorted[i+1]); } Of course, the condition passed to an assertion should not have a side effect (see the box on page 124). Also remember that assertions may be turned off at compile time—never put code that must be executed into an assert. Don't use assertions in place of real error handling. Assertions check for things that should never happen: you don't want to be writing code such as printf("Enter 'Y' or 'N': "); ch = getchar(); assert((ch == 'Y') || (ch == 'N')); /* bad idea!
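The same kind of post-condition check carries over to other languages. As a hedged Python sketch (the names `is_sorted` and `checked_sort` are made up for illustration, and the built-in `sorted` stands in for "a clever sort algorithm"):

```python
def is_sorted(seq):
    """True if every element is <= its successor."""
    return all(seq[i] <= seq[i + 1] for i in range(len(seq) - 1))


def checked_sort(items):
    result = sorted(items)  # stand-in for the routine being checked
    # The condition has no side effects, so nothing breaks if
    # assertions are disabled.
    assert is_sorted(result), "sort postcondition violated"
    return result


print(checked_sort([3, 1, 2]))  # [1, 2, 3]
```

As with the C macros, Python assertions can be switched off (running with `-O` skips them), so the advice about never putting required code inside an assertion applies here too.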

If you test a sort routine with random input keys, you may be surprised the first time it encounters ordered input. Pragmatic Programmers try to cover both the theoretical and practical bases. After all this estimating, the only timing that counts is the speed of your code, running in the production environment, with real data.[2] This leads to our next tip. [2] In fact, while testing the sort algorithms used as an exercise for this section on a 64MB Pentium, the authors ran out of real memory while running the radix sort with more than seven million numbers. The sort started using swap space, and times degraded dramatically. Tip 46 Test Your Estimates If it's tricky getting accurate timings, use code profilers to count the number of times the different steps in your algorithm get executed, and plot these figures against the size of the input.

pages: 263 words: 20,730

Exploring Python by Timothy Budd


Finally the value 1 will be inserted. The result will be a new list into which every element has been inserted. Example: QuickSort The sorting algorithm quick sort provides another illustration of how a problem can be described as a transformation. Quick sort is a recursive algorithm that works by (a) selecting some element, termed the pivot, (b) dividing the original list into three parts, namely those that are smaller than the pivot, those equal to the pivot, and those larger than the pivot, and (c) recursively sorting the first and third, appending the results to obtain the final solution. Once you have described the quick sort algorithm in this fashion, the solution is a simple transliteration:

    def quicksort(a):
        if a:
            # there are various ways of selecting the pivot
            # we simply choose the middle element
            pivot = a[len(a)//2]
            return (quicksort([x for x in a if x < pivot]) +
                    [x for x in a if x == pivot] +
                    quicksort([x for x in a if x > pivot]))
        else:
            return []

We have illustrated higher order functions by passing lambda expressions to functions such as filter and map.

Example – File Sort As an example program presented earlier shows, it is easy to sort the lines of a file if your computer has sufficient memory to maintain the contents of the file in a list. Simply read the file into the list, sort the list, then write the list out to a new file. But what if you have a very large file, one that is too big to fit into memory? The algorithm used to solve this problem is known as file sort. The file sort algorithm uses a number of temporary files as intermediate storage areas. The approach works in three steps. In step 1, the original file is read in small units, say 100 lines at a time. Each unit is sorted and written out to a temporary file. Once these have been created the second step begins. In this step pairs of temporary files are merged into a new file. To merge two files requires only one line at a time from each, and so memory size is not a problem.
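A compact sketch of the file sort idea follows. It is not the book's program: the function name `file_sort` and its parameters are assumptions, and instead of merging temporary files pair by pair as the text describes, it merges all of them in a single k-way pass with the standard library's `heapq.merge`, which likewise keeps only one line per file in memory.

```python
import heapq
import itertools
import os
import tempfile


def file_sort(src_path, dst_path, chunk_lines=100):
    """Sort the lines of src_path into dst_path via sorted temporary files."""
    chunk_paths = []
    # Step 1: read the input in small units, sort each unit in memory,
    # and write it out to its own temporary file.
    with open(src_path) as src:
        while True:
            chunk = list(itertools.islice(src, chunk_lines))
            if not chunk:
                break
            # Give a final unterminated line a newline so that lines
            # cannot run together when merged.
            chunk = [ln if ln.endswith("\n") else ln + "\n" for ln in chunk]
            chunk.sort()
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as tmp:
                tmp.writelines(chunk)
            chunk_paths.append(path)
    # Merge step: heapq.merge reads the sorted files lazily, line by line.
    files = [open(p) for p in chunk_paths]
    try:
        with open(dst_path, "w") as dst:
            dst.writelines(heapq.merge(*files))
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
```

Calling `file_sort("big.txt", "sorted.txt")` (both file names hypothetical) would leave the sorted lines in the second file while never holding more than `chunk_lines` lines of the input in memory at once.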

We have illustrated higher order functions by passing lambda expressions to functions such as filter and map. The flip side is to write a function that accepts a function as argument. For example, you might want a sorting function that allows the user to provide the comparison test as an argument, rather than using the < operator. The quick sort algorithm rewritten to allow the comparison test to be passed as argument is as follows:

    def quicksort(a, cmp):
        if a:
            pivot = a[len(a)//2]
            return (quicksort([x for x in a if cmp(x, pivot)], cmp) +
                    [x for x in a if x == pivot] +
                    quicksort([x for x in a if cmp(pivot, x)], cmp))
        else:
            return []

This version of quicksort could be invoked as follows:

    >>> a = [1, 6, 4, 2, 5, 3, 7]
    >>> print quicksort(a, lambda x, y: x > y)
    [7, 6, 5, 4, 3, 2, 1] # sort backwards

Simple Reductions

Many common tasks can be implemented as a form of reduction.

pages: 238 words: 93,680

The C Programming Language by Brian W. Kernighan, Dennis M. Ritchie


We will illustrate this by modifying the sorting procedure written earlier in this chapter so that if the optional argument -n is given, it will sort the input lines numerically instead of lexicographically. A sort often consists of three parts - a comparison that determines the ordering of any pair of objects, an exchange that reverses their order, and a sorting algorithm that makes comparisons and exchanges until the objects are in order. The sorting algorithm is independent of the comparison and exchange operations, so by passing different comparison and exchange functions to it, we can arrange to sort by different criteria. This is the approach taken in our new sort. Lexicographic comparison of two lines is done by strcmp, as before; we will also need a routine numcmp that compares two lines on the basis of numeric value and returns the same kind of condition indication as strcmp does.
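The separation between the sorting algorithm and the comparison it is handed can be illustrated briefly in Python. This is only an analogue of the C design described above: the Python functions named `strcmp` and `numcmp` are stand-ins written for this sketch, `sort_lines` is a hypothetical driver, and the library call doing the adaptation is `functools.cmp_to_key`.

```python
from functools import cmp_to_key


def strcmp(a, b):
    """Lexicographic comparison: negative, zero, or positive."""
    return (a > b) - (a < b)


def numcmp(a, b):
    """Compare two lines by numeric value, same return convention."""
    x, y = float(a), float(b)
    return (x > y) - (x < y)


def sort_lines(lines, numeric=False):
    # The sorting algorithm is unchanged; only the comparison differs.
    compare = numcmp if numeric else strcmp
    return sorted(lines, key=cmp_to_key(compare))


print(sort_lines(["10", "9", "100"]))                # ['10', '100', '9']
print(sort_lines(["10", "9", "100"], numeric=True))  # ['9', '10', '100']
```

The `numeric` flag plays the role of the optional `-n` argument: it selects which comparison function is passed in, and the sort itself never needs to know.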

    -1 : 1;
        if (s[i] == '+' || s[i] == '-')  /* skip sign */
            i++;
        for (n = 0; isdigit(s[i]); i++)
            n = 10 * n + (s[i] - '0');
        return sign * n;
    }

The standard library provides a more elaborate function strtol for conversion of strings to long integers; see Section 5 of Appendix B. The advantages of keeping loop control centralized are even more obvious when there are several nested loops. The following function is a Shell sort for sorting an array of integers. The basic idea of this sorting algorithm, which was invented in 1959 by D. L. Shell, is that in early stages, far-apart elements are compared, rather than adjacent ones as in simpler interchange sorts. This tends to eliminate large amounts of disorder quickly, so later stages have less work to do. The interval between compared elements is gradually decreased to one, at which point the sort effectively becomes an adjacent interchange method.

    /* shellsort: sort v[0]...v[n-1] into increasing order */
    void shellsort(int v[], int n)
    {
        int gap, i, j, temp;

        for (gap = n/2; gap > 0; gap /= 2)
            for (i = gap; i < n; i++)
                for (j = i-gap; j >= 0 && v[j] > v[j+gap]; j -= gap) {
                    temp = v[j];
                    v[j] = v[j+gap];
                    v[j+gap] = temp;
                }
    }

There are three nested loops.

    #include <stdio.h>

    /* printd: print n in decimal */
    void printd(int n)
    {
        if (n < 0) {
            putchar('-');
            n = -n;
        }
        if (n / 10)
            printd(n / 10);
        putchar(n % 10 + '0');
    }

When a function calls itself recursively, each invocation gets a fresh set of all the automatic variables, independent of the previous set. Thus in printd(123) the first printd receives the argument n = 123. It passes 12 to a second printd, which in turn passes 1 to a third. The third-level printd prints 1, then returns to the second level. That printd prints 2, then returns to the first level. That one prints 3 and terminates. Another good example of recursion is quicksort, a sorting algorithm developed by C.A.R. Hoare in 1962. Given an array, one element is chosen and the others partitioned in two subsets - those less than the partition element and those greater than or equal to it. The same process is then applied recursively to the two subsets. When a subset has fewer than two elements, it doesn't need any sorting; this stops the recursion. Our version of quicksort is not the fastest possible, but it's one of the simplest.

Pearls of Functional Algorithm Design by Richard Bird


Suppose there are A(m, n) different possible answers. Each test of f (x , y) against z has three possible outcomes, so the height h of the ternary tree of tests has to satisfy h ≥ log₃ A(m, n). Provided we can estimate A(m, n) this gives us a lower bound on the number of tests that have to be performed. The situation is the same with sorting n items by binary comparisons; there are n! possible outcomes, so any sorting algorithm has to make at least log₂ n! comparisons in the worst case. It is easy to estimate A(m, n): each list of pairs (x , y) in the range 0 ≤ x < n and 0 ≤ y < m with f (x , y) = z is in a one-to-one correspondence with a step shape from the top-left corner of the m × n rectangle to the bottom-right corner, in which the value z appears at the inner corners of the steps.
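For the sorting case mentioned above, the step from "at least log₂ n! comparisons" to the familiar n log n bound can be filled in with a standard estimate. The derivation below is a supporting sketch, not taken from the excerpt: it keeps only the larger half of the terms in the sum.

```latex
\[
\log_2 n! \;=\; \sum_{k=1}^{n} \log_2 k
\;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \log_2 k
\;\ge\; \frac{n}{2}\,\log_2 \frac{n}{2},
\]
% The second sum has at least n/2 terms, each at least \log_2(n/2).
\[
\text{so}\quad \log_2 n! = \Omega(n \log n).
\]
```

As a small check, for n = 4 there are 4! = 24 possible orderings and log₂ 24 ≈ 4.58, so any comparison sort needs at least 5 comparisons in the worst case.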

i ) (12.8) In words, one can partition a list of pairs by first partitioning with respect to first components and refining the result using the second components. The correct second components can be installed in each run because each run is a list of positions. After installation, each run is partition sorted and the results concatenated. To be accurate, (12.7) holds only if psort is a stable sorting algorithm and the implementation in Figure 12.1 is not stable. If psort is not stable then the elements in each run will appear in a different order in the left- and right-hand sides. But (12.7) does hold if we interpret equality of two partitions to mean equal up to some permutation of the elements in each run. Since the computation of ranktails does not depend on the precise order of the elements in a run, that is all that is required.

The second identity is takeCols (j +1) · hdsort = hdsort · takeCols (j +1) (13.4) In words, sorting an n × n matrix on its first column and then taking a positive number of columns of the result yields exactly the same result as first taking the same number of columns and then sorting on the first column. The third identity, the key one, is not so obvious: hdsort · map rrot · sort · rots = sort · rots (13.5) In words, the following transformation on a matrix of sorted rotations is the identity: move the last column to the front and then resort the rows on the new first column. In fact, (13.5) is true only if hdsort is a stable sorting algorithm, meaning that columns with the same first element appear in the output in the same order that they appeared in the input. Under this assumption we have, applied to an n × n matrix, that sort = (hdsort · map rrot)ⁿ This identity states that one can sort an n × n matrix (in fact, an arbitrary list of lists all of which have length n) by repeating n times the operation of rotating the last column into first position and then stably sorting according to the first column only.
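The final identity can be tried out on a small example. The sketch below is in Python rather than the book's Haskell; `rrot` and `hdsort` mirror the names in the text, `sort_by_rotation` is a made-up name for the n-fold composition, and rows are assumed to all have the same length. It relies on Python's `sorted` being stable.

```python
def rrot(row):
    """Rotate right: move the last element to the front."""
    return row[-1:] + row[:-1]


def hdsort(rows):
    """Stably sort rows on their first element only."""
    return sorted(rows, key=lambda r: r[0])


def sort_by_rotation(rows):
    """Apply (hdsort . map rrot) n times, where n is the row length."""
    n = len(rows[0]) if rows else 0
    for _ in range(n):
        rows = hdsort([rrot(r) for r in rows])
    return rows


rows = ["cab", "abc", "bca", "acb"]
print(sort_by_rotation(rows))  # ['abc', 'acb', 'bca', 'cab']
```

After n rotations every row is back in its original orientation, and the n stable passes have sorted on the last column first and the first column last, which is how a least-significant-digit radix sort works. Replacing `hdsort` with an unstable sort would break the result, as the text warns.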

pages: 292 words: 62,575

97 Things Every Programmer Should Know by Kevlin Henney


When you're designing an application, be mindful of the number of interprocess communications in response to each stimulus. When analyzing applications that suffer from poor performance, I have often found IPC-to-stimulus ratios of thousands-to-one. Reducing this ratio, whether by caching or parallelizing or some other technique, will pay off much more than changing data structure choice or tweaking a sorting algorithm. * * * [6] http://martinfowler.com/eaaCatalog/lazyLoad.html Chapter 42. Keep the Build Clean Johannes Brodwall HAVE YOU EVER LOOKED AT a list of compiler warnings the length of an essay on bad coding and thought to yourself, "You know, I really should do something about that…but I don't have time just now"? On the other hand, have you ever looked at a lone warning that appeared in a compilation and just fixed it?

Test Precisely and Concretely Kevlin Henney IT IS IMPORTANT TO TEST for the desired, essential behavior of a unit of code, rather than for the incidental behavior of its particular implementation. But this should not be taken or mistaken as an excuse for vague tests. Tests need to be both accurate and precise. Something of a tried, tested, and testing classic, sorting routines offer an illustrative example. Implementing a sorting algorithm is not necessarily an everyday task for a programmer, but sorting is such a familiar idea that most people believe they know what to expect from it. This casual familiarity, however, can make it harder to see past certain assumptions. When programmers are asked, "What would you test for?", by far and away the most common response is something like, "The result of sorting is a sorted sequence of elements."

pages: 923 words: 516,602

The C++ Programming Language by Bjarne Stroustrup

Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved. Section 13.4 Using Template Arguments to Specify Policy 339 like to use for a comparison? Two different collating sequences (numerical orderings of the characters) are commonly used for sorting Swedish names. Naturally, neither a general string type nor a general sort algorithm should know about the conventions for sorting names in Sweden. Therefore, any general solution requires that the sorting algorithm be expressed in general terms that can be defined not just for a specific type but also for a specific use of a specific type. For example, let us generalize the standard C library function strcmp() for Strings of any type T (§13.2): template<class T, class C> int compare(const String<T>& str1, const String<T>& str2) { for(int i=0; i<str1.length() && i< str2.length(); i++) if (!
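The excerpt breaks off inside the function body, so the rest of the C++ is not reproduced here. As a rough illustration of the idea it describes (a comparison whose character-ordering rules are supplied from outside), here is a minimal Python sketch. It is not Stroustrup's code: the policy classes `Cmp` and `NoCase` and their `eq`/`lt` methods are hypothetical names chosen for this example.

```python
class Cmp:
    """Default policy (hypothetical): compare characters by code point."""

    @staticmethod
    def eq(a, b):
        return a == b

    @staticmethod
    def lt(a, b):
        return a < b


class NoCase:
    """Alternative policy (hypothetical): ignore letter case."""

    @staticmethod
    def eq(a, b):
        return a.lower() == b.lower()

    @staticmethod
    def lt(a, b):
        return a.lower() < b.lower()


def compare(str1, str2, policy=Cmp):
    """Return a negative, zero, or positive number, like C's strcmp()."""
    for a, b in zip(str1, str2):
        if not policy.eq(a, b):
            return -1 if policy.lt(a, b) else 1
    # All shared positions are equal: the shorter string sorts first.
    return len(str1) - len(str2)


print(compare("abc", "ABC"))          # code-point policy
print(compare("abc", "ABC", NoCase))  # case-insensitive policy
```

The comparison loop never mentions a collating rule itself. Swapping the policy argument changes the ordering without touching `compare`, which is the separation the excerpt argues for.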

Modify the header files to declare all functions called and to declare the type of every argument. Where possible, replace #defines with enum, const, or inline. Remove extern declarations from .c files and if necessary convert all function definitions to C++ function definition syntax. Replace calls of malloc() and free() with new and delete. Remove unnecessary casts.
6. (∗2) Implement ssort() (§7.7) using a more efficient sorting algorithm. Hint: qsort().
7. (∗2.5) Consider: struct Tnode { string word; int count; Tnode* left; Tnode* right; }; Write a function for entering new words into a tree of Tnodes.

These techniques enable an implementer to hide sophisticated implementations behind simple interfaces and to expose complexity to the user only when the user has a specific need for it. For example, sort(v) can be the interface to a variety of sort algorithms for elements of a variety of types held in a variety of containers. The sort function that is most appropriate for the particular v will be automatically chosen. Every major standard library abstraction is represented as a template (for example, string, ostream, complex, list, and map) and so are the key operations (for example, string compare, the output operator <<, complex addition, getting the next element from a list, and sort()).

pages: 375 words: 66,268

High Performance JavaScript by Nicholas C. Zakas

Iteration

Any algorithm that can be implemented using recursion can also be implemented using iteration. Iterative algorithms typically consist of several different loops performing different aspects of the process, and thus introduce their own performance issues. However, using optimized loops in place of long-running recursive functions can result in performance improvements due to the lower overhead of loops versus that of executing a function. As an example, the merge sort algorithm is most frequently implemented using recursion. A simple JavaScript implementation of merge sort is as follows:

    function merge(left, right){
        var result = [];
        while (left.length > 0 && right.length > 0){
            if (left[0] < right[0]){
                result.push(left.shift());
            } else {
                result.push(right.shift());
            }
        }
        return result.concat(left).concat(right);
    }

    function mergeSort(items){
        if (items.length == 1) {
            return items;
        }
        var middle = Math.floor(items.length / 2),
            left = items.slice(0, middle),
            right = items.slice(middle);
        return merge(mergeSort(left), mergeSort(right));
    }

The code for this merge sort is fairly simple and straightforward, but the mergeSort() function itself ends up getting called very frequently.

An array of n items ends up calling mergeSort() 2 * n – 1 times, meaning that an array with more than 1,500 items would cause a stack overflow error in Firefox. Running into the stack overflow error doesn’t necessarily mean the entire algorithm has to change; it simply means that recursion isn’t the best implementation. The merge sort algorithm can also be implemented using iteration, such as:

    //uses the same merge() function from previous example
    function mergeSort(items){
        if (items.length == 1) {
            return items;
        }
        var work = [];
        for (var i=0, len=items.length; i < len; i++){
            work.push([items[i]]);
        }
        work.push([]); //in case of odd number of items
        //Math.floor keeps lim an integer so the loop stops as soon as one list remains
        for (var lim=len; lim > 1; lim = Math.floor((lim+1)/2)){
            for (var j=0,k=0; k < lim; j++, k+=2){
                work[j] = merge(work[k], work[k+1]);
            }
            work[j] = []; //in case of odd number of items
        }
        return work[0];
    }

This implementation of mergeSort() does the same work as the previous one without using recursion.
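For comparison, the same bottom-up idea can be sketched in Python. This is an illustrative translation rather than the book's code: it merges with index cursors instead of `shift()`, and it carries an unpaired last run forward instead of padding the work list with an empty array.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list (stable)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            result.append(right[j])
            j += 1
        else:
            result.append(left[i])
            i += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(items):
    """Bottom-up merge sort: no recursion, so no call-stack growth."""
    work = [[item] for item in items]  # start from runs of length 1
    while len(work) > 1:
        merged = []
        for k in range(0, len(work), 2):
            if k + 1 < len(work):
                merged.append(merge(work[k], work[k + 1]))
            else:
                merged.append(work[k])  # odd run out, carried to next pass
        work = merged
    return work[0] if work else []


print(merge_sort([5, 1, 4, 2, 3]))
```

Each pass halves the number of runs, so the outer loop executes about log2(n) times regardless of input size.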

pages: 236 words: 67,823

Hacking Vim 7.2 by Kim Schulz

You simply have to use the sort() function on the list you get out of the keys() function.

    let mydict = {'a': "apple", 'b': "banana", 'c': "citrus"}
    for keyvar in sort(keys(mydict))
      echo mydict[keyvar]
    endfor

This, of course, requires that the names of the keys can be ordered individually by using a normal sort algorithm. In the previous case, there is no problem because a is before b, which is before c. The sort function can actually take another argument, which is a function name. This way you can make your own sort algorithm to use when sorting special values. See :help sort() for more information and an example.

While loops

The next type of loop we will look at is the while loop. This type of loop, as the name indicates, runs for as long as some condition is true (remember how we previously defined what a condition is in the Conditions section).

pages: 297 words: 77,362

The Nature of Technology by W. Brian Arthur

Viewed this way technology begins to acquire a “genetics.” Nothing equivalent to DNA or its cellular workings of course, or as beautifully ordered as this. But still, a rich interlinked ancestry. All this sounds organic—very organic—and indeed the view we will be led to is as much biological as mechanical. For sure, technologies are not biological organisms, they are mechanistic almost by definition: whether they are sorting algorithms or atomic clocks, they have parts that interact in predictable ways. But once we lay them out as combinations that can be combined into further combinations, we begin to see them not so much as individual pieces of clockwork but as complexes of working processes that interact with other complexes to form new ones. We see a world where the overall collective body of technology forms new elements—new technologies—from existing ones.

It pulls signals from the air, purifies them, and transforms them into sounds. It is a miniature extraction process, a tiny factory that these days can be held in the palm of a hand. All devices in fact process something. That, after all, is why economists refer to technologies as means of production. Does the correspondence work in the other direction? Can we view methods and processes as devices? The answer is yes. Processes and methods—think of oil refining or sorting algorithms—are sequences of operations. But to execute, they always require some hardware, some sort of physical equipment that carries out the operations. We could see this physical equipment as forming a “device” executing this sequence of operations. In the case of oil refining this would be a pretty big device. But the point is valid. Processes are devices if we include the equipment that executes them.

pages: 410 words: 119,823

In the United States, for example, the Federal Trade Commission’s inventory of Equal Credit Opportunity Rights explicitly “prohibits credit discrimination on the basis of race, color, religion, national origin, sex, marital status, age, or because you get public assistance,” and this is intended to protect certain classes of people who have historically been denied access to financing.67 Without access to an algorithm, there is no way of knowing whether it observes those provisions—or, perhaps more worryingly, whether the behaviors it weighs transparently serve as proxies for factors that lenders are specifically forbidden by law to consider in their provision of credit. And finally, without access to its composition, we can’t reconstruct whether the conclusions an algorithm arrives at bear even the slightest relationship to someone’s actual propensity to repay a loan. Like any other sorting algorithm, the ones used in the determination of creditworthiness always direct our attention to a subset of the information that is available. That information may have less bearing on someone’s trustworthiness than other facts which might well be more salient, but which by their nature are less accessible to the lender. The mathematician and alternative-banking activist Cathy O’Neil has documented, for example, that lenders systematically refuse credit to borrowers on the basis of “signals more correlated to being uneducated and/or poor than to the willingness or ability to pay back loans,” and these signals can be as arbitrary as the fact that they exclusively used capital letters in filling out their loan application.68 There might very well be other information that casts a specific individual’s reliability in a much better light, but simply isn’t available to the lender in numerical form, or available at all.

Conversely, as we cannot even in principle specify ahead of time what kinds of correlations might emerge from the analysis of a sufficiently large data set, the only way to prevent all such correlations from being used with discriminatory intent is to ban data capture in the first place—and that’s obviously off the table in any technologically advanced society. As Oxford researchers Bryce Goodman and Seth Flaxman point out, then, the EU regulation is either too narrowly written to be effective, or so broadly interpretable as to be unenforceable. This suggests that it isn’t so much the obscurity of any specific algorithm that presents would-be regulators with their greatest challenge, but the larger obscurity of the way in which sorting algorithms work. And this impression is reinforced by the law’s second major provision, which aims directly at the question of algorithmic opacity. Its Articles 12 and 13 create “the right to an explanation,” requiring that anyone affected by the execution of some algorithmic system be offered the means to understand exactly how it arrived at its judgment, in a “concise, intelligible and easily accessible form, using clear and plain language.”

This clearly relies entirely too much on the initiative, the bravery and the energy of the individual, and fails to account for those situations, and they will be many, in which that individual is not offered any meaningful choice of action. Furthermore, this sort of accountability is ill-suited to the time scale in which algorithmic decisions take place—which is to say, in real time. Explanation and redress are by definition reactive and ex post facto. The ordinary operation of a sorting algorithm will generally create a new set of facts on the ground,72 setting new chains of cause and effect in motion; these will reshape the world, in ways that are difficult if not impossible to reverse, long before anyone is able to secure an explanation. It’s evident that the authors of this well-intended regulation either haven’t quite understood how algorithms achieve their effects, or have failed to come up with language that might meaningfully constrain how they operate.

pages: 194 words: 36,223

Smart and Gets Things Done: Joel Spolsky's Concise Guide to Finding the Best Technical Talent by Joel Spolsky

Joel Spolsky, “The Perils of JavaSchools,” published at www.joelonsoftware.com on December 29, 2005 (search for “JavaSchools”).

A lot of programmers whom you might interview these days are apt to consider recursion, pointers, and even data structures to be a silly implementation detail that has been abstracted away by today’s many happy programming languages. “When was the last time you had to write a sorting algorithm?” they snicker. Still, I don’t really care. I want my ER doctor to understand anatomy, even if all she has to do is put the computerized defibrillator nodes on my chest and push the big red button, and I want programmers to know programming down to the CPU level, even if Ruby on Rails does read your mind and build a complete Web 2.0 social collaborative networking site for you with three clicks of the mouse.

pages: 197 words: 35,256

NumPy Cookbook by Ivan Idris

This is especially true for non-trivial software. The good news is that there are lots of tools to help you. We will review a number of techniques that are popular amongst NumPy users.

Profiling with timeit

timeit is a module that allows you to time pieces of code. It is part of the standard Python library. We will time the NumPy sort function with several different array sizes. The classic quicksort and merge sort algorithms have an average running time of O(n log n); so we will try to fit our result to such a model.

How to do it...

We will require arrays to sort.

Create arrays to sort. We will create arrays of varying sizes containing random integer values:

    times = numpy.array([])
    for size in sizes:
        integers = numpy.random.random_integers(1, 10 ** 6, size)

Measure time. In order to measure time, we need to create a timer and give it a function to execute and specify the relevant imports.
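The recipe is cut off before the timing code, so the following is only a sketch of the general approach, not the book's listing. To keep it dependency-free it times Python's built-in `sorted` instead of the NumPy sort function, and it fits a single coefficient `c` in the assumed model t ≈ c · n · log2(n) by least squares. The helper names `time_sort` and `fit_nlogn` are made up for this example.

```python
import math
import random
import timeit


def time_sort(size, repeats=5):
    """Best-of-N wall time for sorting `size` random integers."""
    data = [random.randint(1, 10 ** 6) for _ in range(size)]
    timer = timeit.Timer(lambda: sorted(data))
    return min(timer.repeat(repeat=repeats, number=1))


def fit_nlogn(sizes, times):
    """Least-squares coefficient c for the one-parameter model t = c * n * log2(n)."""
    xs = [n * math.log2(n) for n in sizes]
    return sum(x * t for x, t in zip(xs, times)) / sum(x * x for x in xs)


sizes = [1000, 2000, 4000, 8000]
times = [time_sort(n) for n in sizes]
c = fit_nlogn(sizes, times)
for n, t in zip(sizes, times):
    print(f"n={n:5d}  measured={t:.6f}s  model={c * n * math.log2(n):.6f}s")
```

Taking the minimum over several repeats reduces noise from other processes. The measured numbers will differ from machine to machine, so only the shape of the curve is meaningful.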

pages: 1,758 words: 342,766

Code Complete (Developer Best Practices) by Steve McConnell

Recursion is usually called into play when a small part of the problem is easy to solve and a large part is easy to decompose into smaller pieces. Recursion isn't useful often, but when used judiciously it produces elegant solutions, as in this example in which a sorting algorithm makes excellent use of recursion:

Java Example of a Sorting Algorithm That Uses Recursion

    void QuickSort( int firstIndex, int lastIndex, String [] names ) {
       if ( lastIndex > firstIndex ) {
          int midPoint = Partition( firstIndex, lastIndex, names );
          QuickSort( firstIndex, midPoint-1, names );   // (1)
          QuickSort( midPoint+1, lastIndex, names );    // (1)
       }
    }

(1) Here are the recursive calls.

In this case, the sorting algorithm chops an array in two and then calls itself to sort each half of the array. When it calls itself with a subarray that's too small to sort, such as ( lastIndex <= firstIndex ), it stops calling itself.
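The excerpt does not show Partition(). To make the recursion concrete, here is a runnable Python sketch with the same shape. The `partition` helper below uses the Lomuto scheme with the last element as pivot; that choice is an assumption made for this example, not necessarily what the book's Partition() does.

```python
def partition(first, last, names):
    """Lomuto partition (assumed scheme): pivot on names[last].

    Returns the pivot's final index; smaller items end up to its left.
    """
    pivot = names[last]
    boundary = first
    for i in range(first, last):
        if names[i] < pivot:
            names[i], names[boundary] = names[boundary], names[i]
            boundary += 1
    names[boundary], names[last] = names[last], names[boundary]
    return boundary


def quick_sort(first, last, names):
    """Sort names[first..last] in place."""
    if last > first:  # subarrays of length 0 or 1 end the recursion
        mid = partition(first, last, names)
        quick_sort(first, mid - 1, names)  # recursive call
        quick_sort(mid + 1, last, names)   # recursive call


names = ["Mia", "Ada", "Zoe", "Bob", "Eve"]
quick_sort(0, len(names) - 1, names)
print(names)
```

The pivot is already in its final position when `partition` returns, which is why both recursive calls can exclude index `mid`.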

Many programmers never work above this level of abstraction, which makes their lives much harder than they need to be. Level 2: Low-Level Implementation Structures Low-level implementation structures are slightly higher-level structures than those provided by the language itself. They tend to be the operations and data types you learn about in college courses in algorithms and data types: stacks, queues, linked lists, trees, indexed files, sequential files, sort algorithms, search algorithms, and so on. If your program consists entirely of code written at this level, you'll be awash in too much detail to win the battle against complexity. Level 3: Low-Level Problem-Domain Terms At this level, you have the primitives you need to work in terms of the problem domain. It's a glue layer between the computer-science structures below and the high-level problem-domain code above.

pages: 259 words: 67,456

The Mythical Man-Month by Brooks, Jr. Frederick P.

Then he can begin designing module boundaries, table structures, pass or phase breakdowns, algorithms, and all kinds of tools. Some time, too, must be spent in communicating with the architect. Meanwhile, on the realization level there is much to be done also. Programming has a technology, too. If the machine is a new one, much work must be done on subroutine conventions, supervisory techniques, searching and sorting algorithms.7 Conceptual integrity does require that a system reflect a single philosophy and that the specification as seen by the user flow from a few minds. Because of the real division of labor into architecture, implementation, and realization, however, this does not imply that a system so designed will take longer to build. Experience shows the opposite, that the integral system goes together faster and takes less time to test.

pages: 231 words: 71,248

Shipping Greatness by Chris Vander Mey

We had the idea that we could support a Netflix-style review ratings system for people like you, so that we can give you the reviews that are most relevant. We figured it was best to put that in the API. You may want to provide less detail than I show in this API. But if you’re building anything developer facing, even if those developers are other engineers within your business, it’s worth considering digging deep so that you don’t expose a problem later. For example, if Reviews were assuming we’d give the content a letter grade, its sort algorithm might be completely different! Step 6. Write the Functional Specifications Document You’ve crafted a product idea that solves a real need for a real group of customers. You’ve pitched and received buy-in from your essential stakeholders. You’ve worked with your dependencies to define how you’ll interface. It’s now time to get into the implementation details and build your big document.

pages: 250 words: 73,574

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers by John MacCormick, Chris Bishop

Instead, the tricks in this book resemble tricks of the trade or even magic tricks: clever techniques for accomplishing goals that would otherwise be difficult or impossible. The first criterion—everyday use by ordinary computer users—eliminates algorithms used primarily by computer professionals, such as compilers and program verification techniques. The second criterion—concrete application to a specific problem—eliminates many of the great algorithms that are central to the undergraduate computer science curriculum. This includes sorting algorithms like quicksort, graph algorithms such as Dijkstra's shortest-path algorithm, and data structures such as hash tables. These algorithms are indisputably great and they easily meet the first criterion, since most application programs run by ordinary users employ them repeatedly. But these algorithms are generic: they can be applied to a vast array of different problems. In this book, I have chosen to focus on algorithms for specific problems, since they have a clearer motivation for ordinary computer users.

Big Data at Work: Dispelling the Myths, Uncovering the Opportunities by Thomas H. Davenport

One way to characterize an important set of differences between types has been coined by Vincent Granville, who operates Data Science Central, a social network for data scientists like himself. In a blog post (with some analytical jargon you can toss around at cocktail parties), he described the difference between vertical and horizontal data scientists:

• Vertical data scientists have very deep knowledge in some narrow field. They might be computer scientists very familiar with computational complexity of all sorting algorithms. Or a statistician who knows everything about eigenvalues, singular value decomposition and its numerical stability, and asymptotic convergence of maximum pseudo-likelihood estimators. Or a software engineer with years of experience writing Python code (including graphic libraries) applied to API development and web crawling technology. Or a database guy with strong data modeling, data warehousing, graph databases, Hadoop and NoSQL expertise.

pages: 260 words: 77,007

Are You Smart Enough to Work at Google?: Trick Questions, Zen-Like Riddles, Insanely Difficult Puzzles, and Other Devious Interviewing Techniques You ... Know to Get a Job Anywhere in the New Economy by William Poundstone

“We were offended at having four-digit numbers”: Auletta, Googled, 32. “A very senior Microsoft developer”: See www.joelonsoftware.com/items/2005/10/17.html. Tyma posed this question to his mother: See Tyma’s blog post at http://paultyma.blogspot.com/2007/03/howto-pass-silicon-valley-software.html. about twenty times faster than quicksort: With 1,000,000 records to sort, Mrs. Tyma’s method requires 1,000,000 operations. Quicksort, and other optimal sorting algorithms, require on the order of 1,000,000 log2 (1,000,000) operations. Taking this at face value, Mrs. Tyma’s method is approximately log2 (1,000,000), or 19.9 +, times faster. Chapter Six “At Google, we believe in collaboration”: Mohammad, blog post at http://allouh.wordpress.com/2009/04/14/interview-with-google/. “You will have this ‘lost in space feeling’”: December 30, 2006, comment by “Daniel” on Shmula blog, www.shmula.com/31/my-interview-job-offer-from-google.
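The speed-up figure in the quicksort note above is the ratio of the two operation counts, with n = 1,000,000 and constant factors ignored, as the note itself does:

```latex
\[
\frac{n \log_2 n}{n} = \log_2 n,
\qquad
\log_2 10^{6} = 6 \log_2 10 \approx 6 \times 3.3219 \approx 19.93 .
\]
```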

The Art of Computer Programming: Fundamental Algorithms by Donald E. Knuth

For example, an algorithm to sort n numbers might be called inefficient "because its running time is $O(n^2)$." But a running time of $O(n^2)$ does not necessarily imply that the running time is not also $O(n)$. There's another notation, Big Omega, for lower bounds: The statement

    $g(n) = \Omega(f(n))$    (20)

means that there are positive constants $L$ and $n_0$ such that $|g(n)| \ge L\,|f(n)|$ for all $n \ge n_0$. Using this notation we can correctly conclude that a sorting algorithm whose running time is $\Omega(n^2)$ will not be as efficient as one whose running time is $O(n \log n)$, if n is large enough. However, without knowing the constant factors implied by $O$ and $\Omega$, we cannot say anything about how large n must be before the $O(n \log n)$ method will begin to win. Finally, if we want to state an exact order of growth without being precise about constant factors, we can use Big Theta notation: $g(n) = \Theta(f(n)) \iff g(n) = O(f(n))$ and $g(n) = \Omega(f(n))$.

22. [23] Program T assumes that its input tape contains valid information, but a program that is intended for general use should always make careful tests on its input so that clerical errors can be detected, and so that the program cannot "destroy itself." For example, if one of the input relations for k were negative, Program T may erroneously change one of its own instructions when storing into X[k]. Suggest ways to modify Program T so that it is suitable for general use.
▶ 23. [27] When the topological sort algorithm cannot proceed because it has detected a loop in the input (see step T8), it is usually of no use to stop and say, "There was a loop." It is helpful to print out one of the loops, thereby showing part of the input that was in error. Extend Algorithm T so that it will do this additional printing of a loop when necessary. [Hint: The text gives a proof for the existence of a loop when N > 0 in step T8; that proof suggests an algorithm.]
24. [24] Incorporate the extensions of Algorithm T made in exercise 23 into Program T.
25. [47] Design as efficient an algorithm as possible for doing a topological sort of very large sets S having considerably more nodes than the computer memory can contain.
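To make exercise 23 concrete, here is one possible Python sketch. It is not Knuth's Algorithm T or his published answer: it is a queue-based topological sort in the same spirit (count predecessors, repeatedly output nodes whose count is zero) that, when nodes are left over, walks backwards through remaining predecessors until a node repeats and reports that loop. The function name and return convention are assumptions made for this example.

```python
from collections import deque


def topological_sort(n, relations):
    """Sort nodes 1..n given pairs (j, k) meaning "j precedes k".

    Returns (order, None) on success, or (partial_order, loop) when the
    input contains a loop; `loop` lists nodes so that each one precedes
    the next and the last precedes the first.
    """
    count = [0] * (n + 1)               # number of unprocessed predecessors
    successors = [[] for _ in range(n + 1)]
    for j, k in relations:
        count[k] += 1
        successors[j].append(k)

    queue = deque(k for k in range(1, n + 1) if count[k] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for k in successors[node]:
            count[k] -= 1
            if count[k] == 0:
                queue.append(k)

    if len(order) == n:
        return order, None

    # Every leftover node still has a leftover predecessor, so following
    # predecessors must eventually revisit a node.
    remaining = {k for k in range(1, n + 1) if count[k] > 0}
    pred = {k: j for j, k in relations if j in remaining and k in remaining}
    path, position = [], {}
    node = next(iter(remaining))
    while node not in position:
        position[node] = len(path)
        path.append(node)
        node = pred[node]
    loop = path[position[node]:]
    loop.reverse()                      # put the loop in "precedes" order
    return order, loop


print(topological_sort(4, [(1, 2), (2, 3), (1, 4)]))
print(topological_sort(4, [(1, 2), (2, 3), (3, 4), (4, 2)]))
```

Which node of the loop is printed first depends on set iteration order, so only the cyclic sequence is meaningful, not its starting point.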

[M20] Find the number of labeled oriented trees with n vertices by using determinants and the result of exercise 2.3.4.2-19. (See also exercise 1.2.3-36.)
13. [15] What oriented tree on the vertices {1, 2, ..., 10} has the canonical representation 3, 1, 4, 1, 5, 9, 2, 6, 5?
14. [10] True or false: The last entry, $f(V_{n-1})$, in the canonical representation of an oriented tree is always the root of that tree.
15. [21] Discuss the relationships that exist (if any) between the topological sort algorithm of Section 2.2.3 and the canonical representation of an oriented tree.
16. [25] Design an algorithm (as efficient as possible) that converts from the canonical representation of an oriented tree to a conventional computer representation using PARENT links.
▶ 17. [M26] Let $f(x)$ be an integer-valued function, where $1 \le f(x) \le m$ for all integers $1 \le x \le m$. Define $x \equiv y$ if $f^{[r]}(x) = f^{[s]}(y)$ for some $r, s \ge 0$, where $f^{[0]}(x) = x$ and $f^{[r+1]}(x) = f(f^{[r]}(x))$. By using methods of enumeration like those in this section, show that the number of functions such that $x \equiv y$ for all x and y is $m^{m-1} Q(m)$, where $Q(m)$ is the function defined in Section 1.2.11.3.
18. [24] Show that the following method is another way to define a one-to-one correspondence between $(n-1)$-tuples of numbers from 1 to n and oriented trees with n labeled vertices: Let the leaves of the tree be $V_1, \ldots, V_k$ in ascending order.

pages: 999 words: 194,942

Clojure Programming by Chas Emerick, Brian Carper, Christophe Grand

[355] Other dog bark onomatopoeia courtesy of https://en.wikipedia.org/wiki/Bark_%28utterance%29#Representation.
[356] Note that you can absolutely opt to use “legacy” dependency injection containers (like Spring, Guice, et al.) with Clojure types and records.

Strategy Pattern

Another common design pattern is the strategy pattern. This pattern allows the selection of a method or algorithm dynamically. Suppose we want to select a sorting algorithm at runtime:

    interface ISorter {
        public void sort (int[] numbers);
    }

    class QuickSort implements ISorter {
        public void sort (int[] numbers) { ... }
    }

    class MergeSort implements ISorter {
        public void sort (int[] numbers) { ... }
    }

    class Sorter {
        private ISorter sorter;
        public Sorter (ISorter sorter) { this.sorter = sorter; }
        public void execute (int[] numbers) { sorter.sort(numbers); }
    }

    class App {
        // static so that it can be called from main()
        public static ISorter chooseSorter () {
            if (...) { return new QuickSort(); } else { return new MergeSort(); }
        }
        public static void main(String[] args) {
            int[] numbers = {5,1,4,2,3};
            Sorter s = new Sorter(chooseSorter());
            s.execute(numbers);
            //... now use sorted numbers
        }
    }

Clojure has a very simple advantage over Java in this case.

Translated literally, our Clojure code might look like this:

    (defn quicksort [numbers] ...)
    (defn mergesort [numbers] ...)
    (defn choose-sorter [] (if ... quicksort mergesort))
    (defn main []
      (let [numbers [...]]
        ((choose-sorter) numbers)))

There are no classes in sight. Each function implementing the semantics of our algorithm can be called directly, without class definitions getting in the way of our specifying behavior. You don’t even have to give your sorting algorithm a name; an anonymous function works just as well. For example, the composition of Clojure’s built-in sort function and then reverse to reverse the sort order is an anonymous function:

    ((comp reverse sort) [2 1 3])
    ;= (3 2 1)

Chain of Responsibility

While Clojure’s facilities make many patterns unnecessary or invisible, a select few remain relevant and continue to impact our design and implementation of Clojure programs.
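The same point carries over to any language with first-class functions. As a hedged illustration (not from the book), here is the strategy selection sketched in Python. The two sorters are deliberately tiny stand-ins, and the size threshold in `choose_sorter` is a made-up criterion, since the excerpt elides the real condition.

```python
import heapq


def quicksort(numbers):
    """Small functional quicksort (illustrative stand-in)."""
    if len(numbers) <= 1:
        return list(numbers)
    pivot, rest = numbers[0], numbers[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


def mergesort(numbers):
    """Small top-down merge sort (illustrative stand-in)."""
    if len(numbers) <= 1:
        return list(numbers)
    mid = len(numbers) // 2
    return list(heapq.merge(mergesort(numbers[:mid]), mergesort(numbers[mid:])))


def choose_sorter(size):
    """Pick a strategy; the threshold is a hypothetical criterion."""
    return quicksort if size < 1000 else mergesort


numbers = [5, 1, 4, 2, 3]
print(choose_sorter(len(numbers))(numbers))

# A strategy does not need a name: compose "sort" and "reverse" inline.
print((lambda xs: list(reversed(sorted(xs))))([2, 1, 3]))
```

The strategy is just a function value, so no interface or wrapper class is needed to pass it around.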

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable and Maintainable Systems by Martin Kleppmann

Byzantine (arbitrary) faults
Nodes may do absolutely anything, including trying to trick and deceive other nodes, as described in the last section.

For modeling real systems, the partially synchronous model with crash-recovery faults is generally the most useful model. But how do distributed algorithms cope with that model?

Correctness of an algorithm

To define what it means for an algorithm to be correct, we can describe its properties. For example, the output of a sorting algorithm has the property that for any two distinct elements of the output list, the element further to the left is smaller than the element further to the right. That is simply a formal way of defining what it means for a list to be sorted. Similarly, we can write down the properties we want of a distributed algorithm to define what it means to be correct. For example, if we are generating fencing tokens for a lock (see “Fencing tokens” on page 303), we may require the algorithm to have the following properties:

Uniqueness
No two requests for a fencing token return the same value.
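Properties stated this way translate directly into executable checks. The sketch below is an illustration rather than anything from the book: `is_sorted` checks the sortedness property and `tokens_unique` checks the uniqueness property for a list of issued tokens. Both function names are hypothetical.

```python
def is_sorted(output):
    """True if no element is followed by a smaller one.

    Checking adjacent pairs is enough, because ordering is transitive.
    Using <= lets equal elements sit side by side, which matches the
    "distinct elements" qualifier in the stated property.
    """
    return all(left <= right for left, right in zip(output, output[1:]))


def tokens_unique(tokens):
    """True if no fencing token value was handed out twice."""
    return len(set(tokens)) == len(tokens)


print(is_sorted([1, 2, 2, 5]), is_sorted([3, 1, 2]))
print(tokens_unique([33, 34, 35]), tokens_unique([33, 34, 34]))
```

Checks like these describe what a correct result looks like without saying anything about how the algorithm produced it.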

While the number of map tasks is determined by the number of input file blocks, the number of reduce tasks is configured by the job author (it can be different from the number of map tasks). To ensure that all key-value pairs with the same key end up at the same reducer, the framework uses a hash of the key to determine which reduce task should receive a particular key-value pair (see “Partitioning by Hash of Key” on page 203). The key-value pairs must be sorted, but the dataset is likely too large to be sorted with a conventional sorting algorithm on a single machine. Instead, the sorting is performed in stages. First, each map task partitions its output by reducer, based on the hash of the key. Each of these partitions is written to a sorted file on the mapper’s local disk, using a technique similar to what we discussed in “SSTables and LSM-Trees” on page 76. Whenever a mapper finishes reading its input file and writing its sorted output files, the MapReduce scheduler notifies the reducers that they can start fetching the output files from that mapper.
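The staged sort described above can be imitated in memory. The following Python sketch is a toy model, not Hadoop's implementation: lists stand in for files on disk, `zlib.crc32` is an arbitrary choice of stable hash, and the function names are invented for this example.

```python
import heapq
import zlib
from itertools import groupby


def partition_for(key, num_reducers):
    """Choose a reducer from a stable hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def map_task_output(pairs, num_reducers):
    """Split one mapper's (key, value) pairs by reducer, each part sorted."""
    partitions = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[partition_for(key, num_reducers)].append((key, value))
    return [sorted(part) for part in partitions]  # one "sorted file" each


def reducer_input(sorted_files):
    """Merge the sorted files fetched from every mapper and group by key."""
    merged = heapq.merge(*sorted_files)  # streaming merge, keeps sort order
    for key, group in groupby(merged, key=lambda pair: pair[0]):
        yield key, [value for _, value in group]


num_reducers = 2
mapper1 = map_task_output([("b", 1), ("a", 2), ("c", 3)], num_reducers)
mapper2 = map_task_output([("a", 4), ("c", 5)], num_reducers)
for r in range(num_reducers):
    for key, values in reducer_input([mapper1[r], mapper2[r]]):
        print(r, key, values)
```

Because every mapper uses the same hash, all pairs for a given key land in the same partition number, and because each partition is already sorted, a reducer only needs a merge, never a full sort.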

pages: 757 words: 193,541

The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2 by Thomas A. Limoncelli, Strata R. Chalup, Christina J. Hogan

One might describe the increase in customers being attracted to your business as growing linearly or exponentially. The run-time of a system might be described as growing in similar terms. Super-linear systems sound awful compared to sub-linear systems. Why not always use algorithms that are constant or linear? The simplest reason is that often algorithms of that order don’t exist. Sorting algorithms have to touch every item at least once, eliminating the possibility of O(1) algorithms. There is one O(n) sort algorithm but it works on only certain kinds of data. Another reason is that faster algorithms often require additional work ahead of time: for example, building an index makes future searches faster but requires the overhead and complexity of building and maintaining the index. That effort of developing, testing, and maintaining such indexing code may not be worth it if the system’s performance is sufficient as is.
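Counting sort is one well-known example of a linear-time sort that works only on certain kinds of data: it needs keys that are small non-negative integers. Whether it is the algorithm the authors had in mind is an assumption. A minimal Python sketch:

```python
def counting_sort(values, max_value):
    """Sort integers in the range 0..max_value in O(n + max_value) time.

    No element is ever compared with another; the keys are used as
    indexes into a table of counts instead.
    """
    counts = [0] * (max_value + 1)
    for value in values:
        counts[value] += 1
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result


print(counting_sort([3, 0, 2, 3, 1, 0], max_value=3))
```

The running time is linear only while `max_value` stays proportional to the number of items. With a large key range, the table of counts dominates the cost.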

pages: 336 words: 88,320

Being Geek: The Software Developer's Career Handbook by Michael Lopp

Provided I have a network connection, this tool magically refreshes a shared directory sitting on each of my machines. I can't think of the last time I worried about which version of a document I was on, and that means I'm spending more time working than worrying. My Tools Are Designed to Remove Repetitive Motion One of my first algorithmic holy shits was during my second computer science class as we were learning sorting algorithms. The professor elegantly walked us through the construction of different algorithms, explaining the pros and the cons, and then he landed Quicksort. Holy shit. It wasn't just the elegance. It wasn't the recursive simplicity; it was the discovery that with imagination there were approaches that were wildly more efficient—and simpler. Whether you're formally trained as a computer science nerd or not, you've learned the value of efficiency—to make each action that you take mean something.

Functional Programming in Scala by Paul Chiusano, Rúnar Bjarnason

Therefore, even though it's possible to abuse the wildcard type to get the naked STRef out, this is still safe since we can't use it to mutate or access the state.

14.2.4 Mutable arrays

Mutable references on their own are not all that useful. A more useful application of mutable state is arrays. In this section we will define an algebra for manipulating mutable arrays in the ST monad and then write an in-place QuickSort algorithm compositionally. We will need primitive combinators to allocate, read, and write mutable arrays:

    sealed abstract class STArray[S,A](implicit manifest: Manifest[A]) {
      protected def value: Array[A]
      def size: ST[S,Int] = ST(value.size)
      def write(i: Int, a: A): ST[S,Unit] = new ST[S,Unit] {
        def run(s: S) = {
          value(i) = a
          ((), s)
        }
      }
      def read(i: Int): ST[S,A] = ST(value(i))
      def freeze: ST[S,List[A]] = ST(value.toList)
    }

    object STArray {
      // Wrap the new array in ST so the result matches the declared return type.
      def apply[S,A:Manifest](sz: Int, v: A): ST[S, STArray[S,A]] =
        ST(new STArray[S,A] { lazy val value = Array.fill(sz)(v) })
    }

Scala requires an implicit Manifest for constructing arrays.

Refactoring: Improving the Design of Existing Code by Martin Fowler, Kent Beck, John Brant, William Opdyke, Don Roberts

Keep using Move Method to move behavior to the classes until the protocols are the same. If you have to redundantly move code to accomplish this, you may be able to use Extract Superclass to atone. Incomplete Library Class Reuse is often touted as the purpose of objects. We think reuse is overrated (we just use). However, we can't deny that much of our programming skill is based on library classes so that nobody can tell whether we've forgotten our sort algorithms. Builders of library classes are rarely omniscient. We don't blame them for that; after all, we can rarely figure out a design until we've mostly built it, so library builders have a really tough job. The trouble is that it is often bad form, and usually impossible, to modify a library class to do something you'd like it to do. This means that tried-and-true tactics such as Move Method lie useless.

pages: 423 words: 21,637

On Lisp: Advanced Techniques for Common Lisp by Paul Graham

If we also have a predicate cara (<- (cara (a _))) which is true of any two-element list whose car is a, then between that and member we have enough constraint for Prolog to construct a definite answer: > (with-inference (and (cara ?lst) (member b ?lst)) (print ?lst)) (A B) This is a rather trivial example, but bigger programs can be constructed on the same principle. Whenever we want to program by combining partial solutions, Prolog may be useful. Indeed, a surprising variety of problems can be expressed in such terms: Figure 24.14, for example, shows a sorting algorithm expressed as a collection of constraints on the solution. 24.4 The Need for Nondeterminism Chapter 22 explained the relation between deterministic and nondeterministic search. A deterministic search program could take a query and generate all the solutions which satisfied it. A nondeterministic search program will use choose to generate solutions one at a time, and if more are needed, will call fail to restart the search.

pages: 429 words: 114,726

The Computer Boys Take Over: Computers, Programmers, and the Politics of Technical Expertise by Nathan L. Ensmenger

As the sociologist John Law has suggested, the “heterogeneous engineering” required to assemble such complex systems blurs the boundaries between the technological and organizational, and typically creates a process fraught with conflict, negotiation, disputes over professional authority, and the conflation of social, political, and technological agendas. Nowhere is this more true than in the history of software. Software is perhaps the ultimate heterogeneous technology. It exists simultaneously as an idea, language, technology, and practice. Although intimately associated with the computer, it also clearly transcends it. For the most part software is invisible, ethereal, and ephemeral—and yet it is also obviously constructed. Certain aspects of software, such as a sorting algorithm, can be generalized and formalized as mathematical abstractions, while others remain inescapably local and specific, subject to the particular constraints imposed by corporate culture, informal industry standards, or government regulations. In this sense, software sits uncomfortably at the intersection of science, engineering, and business. Software is where the technology of computing meets social relationships, organizational politics, and personal agendas.

pages: 496 words: 70,263

Erlang Programming by Francesco Cesarini

Example: concatenate([[1,2,3], [], [4, five]]) ⇒ [1,2,3,4,five]. Hint: you will have to use a help function and concatenate the lists in several steps.

Write a function that, given a list of nested lists, will return a flat list. Example: flatten([[1,[2,[3],[]]], [[[4]]], [5,6]]) ⇒ [1,2,3,4,5,6]. Hint: use concatenate to solve flatten.

Exercise 3-6: Sorting Lists

Implement the following sort algorithms over lists:

Quicksort
  The head of the list is taken as the pivot; the list is then split according to those elements smaller than the pivot and the rest. These two lists are then recursively sorted by quicksort, and joined together, with the pivot between them.

Merge sort
  The list is split into two lists of (almost) equal length. These are then sorted separately and their results merged in order.
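The two descriptions in the exercise can be sketched as follows. The exercise asks for Erlang; this is a Python illustration of the same steps, not a solution from the book, and the function names are chosen here for the example.

```python
def quicksort(xs):
    """Head of the list is the pivot; split the rest into smaller and others."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    others = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(others)


def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(xs):
    """Split into two halves of (almost) equal length, sort each, then merge."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))
```

Both functions return new lists and leave the input untouched, which mirrors how the exercise would work over immutable Erlang lists.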

Pragmatic.Programming.Erlang.Jul.2007 by Unknown

• Generators are written as Pattern <- ListExpr where ListExpr must be an expression that evaluates to a list of terms.
• Filters are either predicates (functions that return true or false) or boolean expressions.

Note that the generator part of a list comprehension works like a filter, so, for example:

    1> [ X || {a, X} <- [{a,1},{b,2},{c,3},{a,4},hello,"wow"]].
    [1,4]

We’ll finish the section on list comprehensions with a few little examples:

Quicksort

Here’s how to write a sort algorithm[10] using two list comprehensions (from lib_misc.erl):

    qsort([]) -> [];
    qsort([Pivot|T]) ->
        qsort([X || X <- T, X < Pivot])
        ++ [Pivot] ++
        qsort([X || X <- T, X >= Pivot]).

(where ++ is the infix append operator):

    1> L=[23,6,2,9,27,400,78,45,61,82,14].
    [23,6,2,9,27,400,78,45,61,82,14]
    2> lib_misc:qsort(L).
    [2,6,9,14,23,27,45,61,78,82,400]

To see how this works, we’ll step through the execution.

[10] This code is shown for its elegance rather than its efficiency. Using ++ in this way is not generally considered good programming practice.

pages: 404 words: 43,442

The Art of R Programming by Norman Matloff

The word embarrassing alludes to the fact that the problems are so easy to parallelize that there is no intellectual challenge involved; they are embarrassingly easy. Both of the example applications we’ve looked at here would be considered embarrassingly parallel. Parallelizing the for i loop for the mutual outlinks problem in Section 16.1 was pretty obvious. Partitioning the work in the KMC example in Section 16.2.4 was also natural and easy. By contrast, most parallel sorting algorithms require a great deal of interaction. For instance, consider merge sort, a common method of sorting numbers. It breaks the vector to be sorted into two (or more) independent parts, say the left half and right half, which are then sorted in parallel by two processes. So far, this is embarrassingly parallel, at least after the vector is divided in half. But then the two sorted halves must be merged to produce the sorted version of the original vector, and that process is not embarrassingly parallel.
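A rough sketch of the two phases described here: the halves are sorted independently, then a single sequential merge combines them. The book works in R; this Python version is only an illustration, and the function name is hypothetical. Threads stand in for the two processes, so in standard CPython builds this shows the structure rather than a real speedup.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_merge_sort(values):
    """Sort two halves concurrently, then merge them in one sequential pass."""
    mid = len(values) // 2
    halves = [values[:mid], values[mid:]]
    # Phase 1: the independent, embarrassingly parallel part.
    with ThreadPoolExecutor(max_workers=2) as pool:
        left, right = pool.map(sorted, halves)
    # Phase 2: the merge needs both results and runs on a single worker.
    return list(heapq.merge(left, right))


result = parallel_merge_sort([5, 12, 13, 6, 3, 8, 1])
```

The merge in phase 2 is the part the passage calls not embarrassingly parallel: it cannot start until both halves are done and it walks the two results together.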

How I Became a Quant: Insights From 25 of Wall Street's Elite by Richard R. Lindsey, Barry Schachter

I was a junior member of the team, but by now had established a reputation of being able to manage a project, deliver results, and argue about formulas with clients of the group. So I was given a chance to work on projects as the group expanded its role into quantitative analysis of banking issues. At first, the areas we were allowed to explore were more industrial than financial, such as a Monte Carlo simulation of check processing that was used to optimize truck delivery routes from the branches to head office and check-sorting algorithms. Then we began to get involved with more financial applications—simulations of how the Bank’s earnings would be impacted by different economic environments, which led to the measurement of risk and return tradeoffs for different asset-liability management strategies. By 1977, I had moved out of the operations research division to head efforts to build models and systems for a new asset-liability committee and then to start a modeling group to support decision making on the firm’s trading floor.

pages: 527 words: 147,690

Perhaps, they’ll decide to start directing me past restaurants that advertise with them and locations featuring billboards with their pay-per-gaze technology. As I pass these restaurants, I might receive ads or coupons in my Gmail in-box offering me a discount. Along the way, my entire urban experience potentially comes under the influence of Google. The maps example also raises some of the same problems we run into when thinking about sorting algorithms. Knowing that each action can influence various overseeing algorithms, do we adjust our behavior accordingly? Do we do things just so that the algorithms monitoring us won’t go off track, so that it’ll still “like” the things we like? You might already do a form of this—say, give a thumbs-up to a song on Pandora because you want to hear more of that type (assuming you trust Pandora’s system to usefully recognize a particular song type).

pages: 496 words: 174,084

Masterminds of Programming: Conversations With the Creators of Major Programming Languages by Federico Biancuzzi, Shane Warden

If you’re writing a general purpose sorting routine, then you have to call a subroutine or do something like that to determine whether item A is less than item B, whatever that is. If you’re trying to sort keys to records or something, then you have to know the ordering of whatever it is you’re sorting. They may be different kinds of things. Sometimes they may be character strings, but think of what the possibilities are. When you write the sorting algorithm, you don’t know any of that stuff, which means it has to be put in later. If it’s done at runtime, of course, then it’s runtime interpretation. That can be done efficiently, don’t get me wrong, because you can have a little program, a subroutine, and then all the person has to do is, in the subroutine, to write the rules for ordering the elements that he’s sorting. But it isn’t automatic.
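The idea of supplying the ordering rule later, as a small subroutine written by the person doing the sorting, can be illustrated in Python with `functools.cmp_to_key`. The comparison function `by_length_then_alpha` and the sample data are made up for this sketch.

```python
from functools import cmp_to_key


def by_length_then_alpha(a, b):
    """Return negative, zero, or positive: shorter strings first, ties alphabetical."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)


words = ["pear", "fig", "apple", "kiwi"]
# The general-purpose sort knows nothing about strings; the rule is passed in.
ordered = sorted(words, key=cmp_to_key(by_length_then_alpha))
```

The sort routine calls the subroutine whenever it needs to know whether one item comes before another, which is the runtime interpretation the speaker describes.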

pages: 754 words: 48,930

Programming in Scala by Martin Odersky, Lex Spoon, Bill Venners

Table 16.1 · Basic list operations

| What it is | What it does |
| --- | --- |
| empty.isEmpty | returns true |
| fruit.isEmpty | returns false |
| fruit.head | returns "apples" |
| fruit.tail.head | returns "oranges" |
| diag3.head | returns List(1, 0, 0) |

One simple way to do so is insertion sort, which works as follows: To sort a non-empty list x :: xs, sort the remainder xs and insert the first element x at the right position in the result. Sorting an empty list yields the empty list. Expressed as Scala code, the insertion sort algorithm looks like:

    def isort(xs: List[Int]): List[Int] =
      if (xs.isEmpty) Nil
      else insert(xs.head, isort(xs.tail))

    def insert(x: Int, xs: List[Int]): List[Int] =
      if (xs.isEmpty || x <= xs.head) x :: xs
      else xs.head :: insert(x, xs.tail)

16.5 List patterns

Lists can also be taken apart using pattern matching. List patterns correspond one-by-one to list expressions. You can either match on all elements of a list using a pattern of the form List(...), or you take lists apart bit by bit using patterns composed from the :: operator and the Nil constant.
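For readers who want to trace the insertion sort from this excerpt outside Scala, here is a Python transcription that keeps the same two functions and the same recursion. It is a sketch for illustration; because it recurses once per element, it is only suitable for short lists.

```python
def insert(x, xs):
    """Insert x at the right position in the already sorted list xs."""
    if not xs or x <= xs[0]:
        return [x] + xs
    return [xs[0]] + insert(x, xs[1:])


def isort(xs):
    """Sort the remainder, then insert the first element into the result."""
    if not xs:
        return []
    return insert(xs[0], isort(xs[1:]))


example = isort([5, 3, 12, 1])
```

Tracing `isort([5, 3, 12, 1])` by hand shows the recursion first reaching the empty list, then inserting 1, 12, 3 and 5 in that order on the way back out.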

pages: 1,065 words: 229,099

Real World Haskell by Bryan O'Sullivan, John Goerzen, Donald Stewart, Donald Bruce Stewart

First, we import the QuickCheck library, and any other modules we need:

    -- file: ch11/QC-basics.hs
    import Test.QuickCheck
    import Data.List

And the function we want to test—a custom sort routine:

    -- file: ch11/QC-basics.hs
    qsort :: Ord a => [a] -> [a]
    qsort [] = []
    qsort (x:xs) = qsort lhs ++ [x] ++ qsort rhs
        where lhs = filter (< x) xs
              rhs = filter (>= x) xs

This is the classic Haskell sort implementation: a study in functional programming elegance, if not efficiency (this isn’t an in-place sort). Now, we’d like to check that this function obeys the basic rules a good sort should follow. One useful invariant to start with, and one that comes up in a lot of purely functional code, is idempotency—applying a function twice has the same result as applying it only once. For our sort routine—a stable sort algorithm—this should certainly be true, or things have gone horribly wrong! This invariant can be encoded as a property simply, as follows:

    -- file: ch11/QC-basics.hs
    prop_idempotent xs = qsort (qsort xs) == qsort xs

We’ll use the QuickCheck convention of prefixing test properties with prop_ in order to distinguish them from normal code. This idempotency property is written simply as a Haskell function stating an equality that must hold for any input data that is sorted.
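The same idempotency check can be approximated without QuickCheck by generating random inputs by hand. The sketch below is a Python stand-in, not the library's API: `check` is a hypothetical helper that tries a property on random integer lists and returns the first counterexample it finds, or `None`.

```python
import random


def qsort(xs):
    """Python transcription of the excerpt's list-based quicksort."""
    if not xs:
        return []
    x, rest = xs[0], xs[1:]
    return qsort([y for y in rest if y < x]) + [x] + qsort([y for y in rest if y >= x])


def prop_idempotent(xs):
    """Sorting twice must give the same result as sorting once."""
    return qsort(qsort(xs)) == qsort(xs)


def check(prop, trials=200, seed=0):
    """Try prop on random lists; return a failing input, or None if all pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if not prop(xs):
            return xs
    return None


counterexample = check(prop_idempotent)
```

Unlike QuickCheck, this helper does no shrinking of failing cases; it only shows the shape of a property-based test.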

pages: 968 words: 224,513

The Art of Assembly Language by Randall Hyde

Unfortunately, if the data is not sorted (worst case, if the data is sorted in reverse order), then this algorithm is extremely inefficient. Indeed, although it is possible to modify the code above so that, on the average, it runs about twice as fast, such optimizations are wasted on such a poor algorithm. However, the bubble sort is very easy to implement and understand (which is why introductory texts continue to use it in examples). * * * [63] Fear not, you'll see some better sorting algorithms in Chapter 5. 4.22 Multidimensional Arrays The 80x86 hardware can easily handle single-dimensional arrays. Unfortunately, there is no magic addressing mode that lets you easily access elements of multidimensional arrays. That's going to take some work and several instructions. Before discussing how to declare or access multidimensional arrays, it would be a good idea to figure out how to implement them in memory.
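The behavior described here, cheap on sorted input and very slow on reversed input, is typical of a bubble sort that stops as soon as a pass makes no swaps. The code the excerpt refers to is 80x86 assembly and is not shown, so the Python below is an assumed reconstruction of the general technique, not the book's listing. It also returns the number of passes so the difference is visible.

```python
def bubble_sort(values):
    """Bubble sort with an early exit; returns (sorted_list, passes_made)."""
    items = list(values)
    unsorted_len = len(items)
    passes = 0
    while True:
        swapped = False
        passes += 1
        for i in range(unsorted_len - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        unsorted_len -= 1  # the largest remaining item is now in place
        if not swapped:
            break          # a pass without swaps means the list is sorted
    return items, passes


already_sorted = bubble_sort([1, 2, 3, 4, 5])
reversed_input = bubble_sort([5, 4, 3, 2, 1])
```

On already sorted data the loop ends after a single pass, while reversed data needs roughly one pass per element, each doing work proportional to the remaining length.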

pages: 643 words: 53,639

Rapid GUI Programming With Python and Qt by Mark Summerfield

If the minutes or notes are passed as None, we take that to mean that they have not been changed. If the movie’s title or year has changed, the movie may now be in the wrong position in the __movies list. In these cases, we find the movie using its original title and year, set the new title and year, and then re-sort the list. This is not as expensive in practice as it may at first appear. The list will contain, at most, one incorrectly sorted item, and Python’s sort algorithm is highly optimized for partially sorted data. If we ever found that we had a performance problem here, we could always reimplement updateMovie() using delete() and add() instead.

    @staticmethod
    def formats():
        return "*.mqb *.mpb *.mqt *.mpt"

Normally, we would provide one, or at most two, custom data formats for an application, but for the purposes of illustration we provide three formats using four extensions.
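The two update strategies mentioned earlier in this excerpt, re-sorting a list with one misplaced item versus deleting and re-adding it, can be compared in a small sketch. The tuples and function names below are hypothetical stand-ins for the book's movie objects and its updateMovie(), delete(), and add() methods, which are not shown here.

```python
import bisect


def update_by_resort(movies, old, new):
    """Replace old with new in place, then re-sort the nearly sorted list."""
    movies[movies.index(old)] = new
    movies.sort()


def update_by_reinsert(movies, old, new):
    """Remove old, then insert new at its sorted position with bisect."""
    movies.remove(old)
    bisect.insort(movies, new)


first = [("Alien", 1979), ("Brazil", 1985), ("Casablanca", 1942)]
second = list(first)
update_by_resort(first, ("Alien", 1979), ("Zelig", 1983))
update_by_reinsert(second, ("Alien", 1979), ("Zelig", 1983))
```

Both approaches end with the same ordering; the re-sort version is shorter to write, which fits the excerpt's point that it is cheap enough in practice.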

pages: 647 words: 43,757

Types and Programming Languages by Benjamin C. Pierce

(X→X→Bool) → List X → List X

To test that sort is working correctly, we construct an out-of-order list,

    l = cons [Nat] 9 (cons [Nat] 2 (cons [Nat] 6 (cons [Nat] 4 (nil [Nat]))));

sort it,

    l = sort [Nat] leqnat l;

and read out the contents:

    nth = λX. λdefault:X. fix (λf:(List X)→Nat→X. λl:List X. λn:Nat. if iszero n then head [X] default l else f (tail [X] l) (pred n));
    ▶ nth : ∀X. X → List X → Nat → X
    nth [Nat] 0 l 0;
    ▶ 2 : Nat
    nth [Nat] 0 l 1;
    ▶ 4 : Nat
    nth [Nat] 0 l 2;
    ▶ 6 : Nat
    nth [Nat] 0 l 3;
    ▶ 9 : Nat
    nth [Nat] 0 l 4;
    ▶ 0 : Nat

The demonstration that a well-typed sorting algorithm could be implemented in System F was a tour de force by Reynolds (1985). His algorithm was a little different from the one presented here.

23.5.1 Solution: The structure of the proof is almost exactly the same as for 9.3.9 (see page 107). For the type application rule E-TappTabs, we need one additional substitution lemma, paralleling Lemma 9.3.8 (see page 106). If Γ, X, Δ ⊢ t : T, then Γ, [X ↦ S]Δ ⊢ [X ↦ S]t : [X ↦ S]T.

The Art of Computer Programming by Donald Ervin Knuth

As they stand, formulas (11) are not readily adapted to computer calculation, since we are asking for a maximum over infinitely many values of x. But from the fact that $F(x)$ is increasing and the fact that $F_n(x)$ increases only in finite steps, we can derive a simple procedure for evaluating the statistics $K_n^+$ and $K_n^-$:

Step 1. Obtain independent observations $X_1, X_2, \ldots, X_n$.

Step 2. Rearrange the observations so that they are sorted into ascending order, $X_1 \le X_2 \le \cdots \le X_n$. (Efficient sorting algorithms are the subject of Chapter 5. But it is possible to avoid sorting in this case, as shown in exercise 23.)

Step 3. The desired statistics are now given by the formulas

$$K_n^+ = \sqrt{n}\,\max_{1\le j\le n}\left(\frac{j}{n} - F(X_j)\right), \qquad K_n^- = \sqrt{n}\,\max_{1\le j\le n}\left(F(X_j) - \frac{j-1}{n}\right).$$

An appropriate choice of the number of observations, n, is slightly easier to make for this test than it is for the $\chi^2$ test, although some of the considerations are similar. If the random variables $X_j$ actually belong to the probability distribution $G(x)$, while they were assumed to belong to the distribution given by $F(x)$, we want n to be comparatively large, in order to reject the hypothesis that $G(x) = F(x)$; for we need n large enough that the empirical distributions

Table 2. SELECTED PERCENTAGE POINTS OF THE DISTRIBUTIONS $K_n^+$ AND $K_n^-$

| | p = 1% | p = 5% | p = 25% | p = 50% | p = 75% | p = 95% | p = 99% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| n = 1 | 0.01000 | 0.05000 | 0.2500 | 0.5000 | 0.7500 | 0.9500 | 0.9900 |
| n = 2 | 0.01400 | 0.06749 | 0.2929 | 0.5176 | 0.7071 | 1.0980 | 1.2728 |
| n = 3 | 0.01699 | 0.07919 | 0.3112 | 0.5147 | 0.7539 | 1.1017 | 1.3589 |
| n = 4 | 0.01943 | 0.08789 | 0.3202 | 0.5110 | 0.7642 | 1.1304 | 1.3777 |
| n = 5 | 0.02152 | 0.09471 | 0.3249 | 0.5245 | 0.7674 | 1.1392 | 1.4024 |
| n = 6 | 0.02336 | 0.1002 | 0.3272 | 0.5319 | 0.7703 | 1.1463 | 1.4144 |
| n = 7 | 0.02501 | 0.1048 | 0.3280 | 0.5364 | 0.7755 | 1.1537 | 1.4246 |
| n = 8 | 0.02650 | 0.1086 | 0.3280 | 0.5392 | 0.7797 | 1.1586 | 1.4327 |
| n = 9 | 0.02786 | 0.1119 | 0.3274 | 0.5411 | 0.7825 | 1.1624 | 1.4388 |
| n = 10 | 0.02912 | 0.1147 | 0.3297 | 0.5426 | 0.7845 | 1.1658 | 1.4440 |
| n = 11 | 0.03028 | 0.1172 | 0.3330 | 0.5439 | 0.7863 | 1.1688 | 1.4484 |
| n = 12 | 0.03137 | 0.1193 | 0.3357 | 0.5453 | 0.7880 | 1.1714 | 1.4521 |
| n = 15 | 0.03424 | 0.1244 | 0.3412 | 0.5500 | 0.7926 | 1.1773 | 1.4606 |
| n = 20 | 0.03807 | 0.1298 | 0.3461 | 0.5547 | 0.7975 | 1.1839 | 1.4698 |
| n = 30 | 0.04354 | 0.1351 | 0.3509 | 0.5605 | 0.8036 | 1.1916 | 1.4801 |
| n > 30: $y_p$ = | 0.07089 | 0.1601 | 0.3793 | 0.5887 | 0.8326 | 1.2239 | 1.5174 |

For n > 30 the entry is $y_p - \tfrac{1}{6} n^{-1/2} + O(1/n)$, where $y_p^2 = \tfrac{1}{2}\ln\bigl(1/(1-p)\bigr)$.

(To extend this table, see Eqs.
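Steps 1 to 3 of the procedure in this excerpt can be sketched in a few lines. This is an illustrative Python version under the assumption that the hypothesized distribution function F is available as a callable named `cdf`; the function name `ks_statistics` is chosen here and does not come from the book.

```python
import math


def ks_statistics(observations, cdf):
    """Return (K_plus, K_minus) for the observations against the given cdf."""
    xs = sorted(observations)  # Step 2: ascending order
    n = len(xs)
    root_n = math.sqrt(n)
    # Step 3: j runs from 1 to n over the sorted observations.
    k_plus = root_n * max(j / n - cdf(x) for j, x in enumerate(xs, start=1))
    k_minus = root_n * max(cdf(x) - (j - 1) / n for j, x in enumerate(xs, start=1))
    return k_plus, k_minus


# Example: three observations tested against the uniform distribution on [0, 1].
k_plus, k_minus = ks_statistics([0.9, 0.1, 0.5], lambda x: x)
```

The resulting values would then be compared with the percentage points tabulated for the matching n.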