Wednesday, April 29, 2009

More Programming Pearls - Reading notes

==## Part I: Programming techniques ##==

==Column 1== Profilers

- Profilers: 1) line-count, 2) procedure-time
- e.g. Optimize prime testing
* test until square root
* rule out 2, 3, 5
* use multiplication instead of division
* test only previous primes
* simple sieve of Evatosthenes
(complexity: time - O((nlogn)(loglogn)), space - O(n)).
(Complexity with optimizations such as wheel factorization: time - O(n), space -
O(n1 / 2loglogn / logn)

- A specialized profiler
- Building profiler: 1) insert counter to source code, 2) run it, 3) collect counter output.

==Column 2== Associative arrays

- i.e., Associative array in AWK, or Hash in perl.
- e.g., A finite state machine simulator
- e.g., Topological sorting

==Column 3== Confession of a coder

- Scaffolding for testing and debugging
- e.g. Binary search
- e.g. k-th smallest selection
- e.g. A subroutine library

==Column 4== Self-describing data

- Name-value pairs
- Provenances in programming. (metadata, e.g., history of the code)
- e.g. A sorting lab

==## Part II: Tricks of the trade ##==

==Column 5== Cutting the gordian knot (finding a clever solution to a complex problem)

- Gordius tied the knot. Asia was the promised prize to anyone who can untie it. Alexander the Great approached it in 333 B.C. He drew his sword and slashed through the knot. Asia was soon his.
- Mail sorting: instead of let a clerk do the sorting, just let the mailman drop to different mailbox.
- Data transmission: instead of an automobile courier, use a pigeon.
- Random sample: draw from a box.

==Column 6== Bumper-sticker computer science

- pi seconds is a nano century
- Plagiarism is the sincerest form of flattery
- Coding
* If you can't write it down in English, you can't code it.
- User Interface
- Debugging
- Performance
- Documentation
- Managing software
- Miscellaneous rules

==Column 7== The envelope is back (Back of envelop calculation)

- Order of magnitude.
- Little's Law (Law of system flow and congestion): the average of things in the system is the product of the average rate at which things leave the system and the average time each one spends in the system.

==Column 8== The furbelow memorandum

- Larger programmer stuff team leads to larger cost than expected
- Fred Brooke's Mythical Man Month (1975).

==## Part III: I/O Fit for Humans ##==

==Column 9== Little language

- The PIC language

==Column 10== Document design

- Tables
- 3 design principles: interaction, consistency, minimalism
- Figures
- Text
- Medium

==Column 11== Graphic output

==Column 12== A survey of surveys

==## Part IV: Algorithms ##==

==Column 13== Random sampling

- A flawed algorithm
- * Floyd's algorithm
- * Random permutations

==Column 14== Birth of a cruncher

- The problem
- Newton iteration
- Decide starting point
- Coding

==Column 15== k-th smallest selection

- Problem
- Program.
* O(n log(n)) algorithm
* An O(n) algorithm by L.A.R. Hoare: by quick-sort
- Analysis
- Principles

Thursday, April 16, 2009

Programming Pearls - Reading notes

==## Part I ##==Preliminaries

==Column 1== Cracking the Oyster

The major point is find the optimal solution of a problem, instead of going straight with a rash solution.
The coding example is soring with bit vector (aka bitmap).

==Column 2== Aha! Algorithms

- binary search
* search
* find bug by setting checking points in a binary search pattern
* find missing element in an integer range
* finding root for equation: bisection method in numerical analysis

- the power of primitives
Problem: rotating an array ab to ba.
Solutions:
* copying. but this is space inefficient
* juggling
* recursive swapping
* define primitive action reverse(): a b -> a^r b -> a^r b^r -> b a

- sorting
Problem: find anagrams in a dictionary.
Solution: get signature for each word.

==Column 3== Data structures programs

- Use of array
- Structuring data
- Powerful tools for specialized data

Don't write a big program when a little one will do.

==Column 4== Writing correct programs

- binary search - hard to get right
- program verification, invariant

==Column 5== A small matter of programming

- Use assertion for correctness
- Scaffolding
- Automated testing
- debugging
- Timing

==## Part II ##==Performance

==Column 6== Perspective on Performance

- A case study: Andrew Appel's many-body simulation program
- Work at many level to achieve performance improvement:
* problem definition
* system structure
* algorithms and data structures
* code tuning
* hardware

==Column 7==The back of the envelop

- Calculation by reasonable estimation
- Quick check: Test by dimension
- Rules of thumb: e.g., 1) Rule of 72 (for exponential increase), 2) pi seconds is a nanocentury.
- Performance estimates, and little experiments
- Safety factors: compensate ignorance with extra safe factors
- Little's Law: queue size = consumption rate * average wait time

==Column 8==Algorithm design techniques

- Problem: range of array for max sum
- Solutions:
* cubic
* quadratic
* Divide and conquer (n log(n))
* scanning (linear)

==Column 9==Code tuning

- Prevent premature optimization
- Optimization should be made on the bottle-neck part - profiling the program

==Column 10==Squeezing space

- The key is simplicity
- Example: sparse matrix representation of grid.
- When simplicity is not sufficient, there are skills to better utilize space:
* recompute
* sparse data structure
* data compression
* allocation policies
* garbage collection

==## Part III ##==The Product

==Column 11==Sorting

- Insertion sort
- Quick sort

==Column 12== A sample problem

- Problem: sampling: select m from n integers
* by selection
* by shuffling
- Principles: understand the problem, specify an abstraction, explore design space, implement, retrospect.

==Column 13== Searching

- Problem: store a set of integers (w/o associated data).
- linear structure
- binary search trees: STL, BST, BST*, Bins, Bins*, BitVec
- structures for integers

==Column 14== Heaps

- Heap, Priority Queue, Heap sort
Comment: the material here can be found in any data structure and algorithm textbook, nothing new.

== Column 15== Strings of pearls

- We are surrounded by strings
- Words. 1) Map, Set, 2) Hash (no worst case guarantee, no order information)
- Phrases.
* the longest substring problem - solved by suffix array
- Generating sentences

==Appendix 1== A catalog of algorithms

- Sorting
- Searching
- Other Set algorithms
- Algorithms on Strings
- Vector and Matrix algorithms
- Random objects
- Numerical algorithms

==Appendix 4== Rules for code tuning

- Space-for-time rules
- Time-for-space rules
- Loop rules
- Logic rules
- Procedure rules
- Expression rules

Blog Archive

Followers