Saturday, December 21, 2013

Number system without 7

A special country does not use the number 7. Whenever a 7 is encounter, the next possible
number is used. For example, 7 -> 8, 17 -> 19. Given a number in the country, translate it
into our number.

Solution:

Obviously, this is a base-9 number system, made of digits 0,1,2,3,4,5,6,8,9, where 8 is
actually 7, and 9 is actually 8.

If a number is given as d1d2d3...dn, then the translation is:

D1 * 9^(n-1) + D2 * 9^(n-2) + .. + Dn * 9^0,

where Di = di if di = 0-6,
Di = di - 1 if di = 8 or 9.

Code verification is done below.


#include <iostream>
using namespace std;

// return true if x contains digit 7.
bool contains7(int x) {
    while (x > 0) {
        int y = x % 10;
        if (y == 7) return true;
        x /= 10;
    }
    return false;
}

void getX(int & x) {
    do {
        ++ x;
    } while (contains7(x));
}

// convert new number system number back to normal number.
int convert_back(int x) {
    int v = 0;
    int base = 1;
    while (x > 0) {
        int y = x % 10;
        if (y == 8 || y == 9) y -= 1;
        v += y * base;

        base *= 9;
        x /= 10;
    }
    return v;
}

int main() {
    int x = 0; // new number system number that ignores 7.
    for (int i = 1; i < 1000; ++ i) {
        getX(x);
        string result = (i == convert_back(x) ? "ok" : "err");
        cout << i << "=> " << x << " " << result << "\t";
        if (result == "err") break;
        if (i % 5 == 0) cout << endl;
    }
    cout << endl;
    return 0;
}

Find meeting point

I. Manhattan distance

In a city there are N persons. A person can walk only horizontally or vertically. Find a point that minimizes the sum of distances all persons walk to the point.

This is called the manhattan distance since a person can walk only horizontally or vertically, like in the city of Manhattan.

Let's assume this is 1-dimensional. Then for 2 persons, the point can be any point on the line connecting the 2 persons. For 3 persons , the point is the middle person. For 4 persons, the point can be any point between the middle 2 persons. For 5 persons, the point is the middle person. In general, for odd number of persons, it's the person in the middle; for even number of persons, it's any point between the middle 2 persons.

This can be extended to 2-dimensional. The answer is the point (x, y), where x and y are the median points taken independently of all the xi and yi for i = 1 to n. [1][2]

II. Geometric median.

The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points [3]. This no longer requires the path be horizontal or vertical.

Despite the simple form, the solution is much more complex than the similar problem of finding the center of mass, which minimizes the sum of the squares of distances of the points to the center. There is no simple formula for the solution.

For 2 points, it's any point on the line connecting the 2 points.

For 3 non-collinear points, the problem is known as Fermat's problem. Solution is in [3]. For 4 co-planar points, the solution is also in [3].

For more points, the solution can be approximated by numerical methods such as the Weiszfeld's algorithm.

References:
[1] Shortest distance travel - common meeting point
[2] Algorithm to find point of minimum total distance from locations
[3] Geometric median

Saturday, August 10, 2013

Is It Down Right Now

Is It Down Right Now

Interesting website to check if your favorite web sites are down right now.

Programming languages used in most popular websites

Programming languages used in most popular websites

Front end all use JavaScript.

Back end, still these main stream languages: C/C++, Java, PHP, .NET. Also some Python, Perl, Ruby, Scala etc.

Tuesday, July 16, 2013

Break captcha in Chinese

Most captchas are in English/numbers.

Here is an interesting video on how to break captchas in Chinese. The key is this writing recognition site http://www.nciku.com/. This is a nice site to help learning Chinese.

Here the Chinese character is not skewed, so it's easy to draw it. It'll be harder if the characters are skewed.

But ultimately, to make it safer, captcha better incorporate some semantic meaning, like asking for an answer to a question with culture background, instead of just typing the character.

Essential PHP security

Essential PHP security. By Chris Shflett. 2006.

Nice book. Basic rules can apply to sites built in other languages.


ADO.NET 4 Database Programming 2010

Murach's ADO.NET 4 Database Programming with VB 2010, 4th Edition. By Anne Boehm, Ged Mead.

- ADO.NET connection types: sql, ole, odbc
- ADO.NET objects: Connection --> command --> 1) data reader, 2) data adaptor --> dataset, binding.
- Stored procedures and parameters
- Transaction. Begin, Commit, Rollback, Savepoint
- GridView/DetailsView: add/edit/delete, select/multi-select, paging. Can implement these myself.
- XML
- LINQ: to: xml/sql/entity/dataset/objects
- Entity Framework

Tuesday, July 9, 2013

Visual C# 2008/2012. By John Sharp

p.151. Nullable type.
   int i = null; // wrong
   int? i = null; // right

p.152. ref, out (must assign value in method).
   Need to declare variable as ref/out at both calling site and function definition.

p.176. Class & Struct
                                                                          struct                    class
type                                                                    value                    reference
live on                                                                 stack                   heap
can declare default ctor?                                      no                       yes
after declare ctor, default ctor auto-created?        yes                      no
automatic initialize fields?                                     no                       yes
can initialize instance fields at declaration?            no                       yes

- Collection:
   - Hashtable
   - SortedList - sorted hash table (RB tree?)

p.207. Parameter Arrays. - variable param list.
    - void func(params int[] a) {...}

p.217. CH 12. Inheritance
   - new, virtual, override, hiding/overriding, protected.
   - extension methods?
   - public, protected, private, internal, protected internal 
   - internal: Internal types or members are accessible only within files in the same assembly,
   - protected v.s. protected internal:
      - protected; derived types may access the member.
      - protected internal; only derived types or types within the same assembly can access that member,
        so they need to be in the same Dynamic Link Library or an executable file.

p.239. Interface, Abstract class, Sealed class.
   - interface => virtual => override => sealed

p.274. Property, get/set

p.295. CH 16. Indexer?

p.311. delegate, event

p.333. CH 18. Generics.
    Queue myQ = new Queue();

p.371. LINQ. Language Integrated Query
   - LINQ, DLINQ, XLINQ.
     - Linq is a programming model that introduces queries as a first-class concept into any Microsoft .NET language
     - DLinq (Linq to SQL) is an extension to Linq that allows querying a database and do object-relational mapping.
     - XLinq (Linq to XML) is an extension to Linq that allows querying/creating/transforming XML documents.
   - After ASP.NET 4.0, emphasis is on Entity Framework, which replaces LINQ.

   - Linq v.s. Entity Framework. (some explanation)
     - LINQ to SQL only supports 1 to 1 mapping of database tables, views, sprocs and functions available in Microsoft SQL Server. It's a great API to use for quick data access construction to relatively well designed SQL Server databases. LINQ2SQL was first released with C# 3.0 and .Net Framework 3.5.
    - LINQ to Entities (ADO.Net Entity Framework) is an ORM (Object Relational Mapper) API which allows for a broad definition of object domain models and their relationships to many different ADO.Net data providers. As such, you can mix and match a number of different database vendors, application servers or protocols to design an aggregated mash-up of objects which are constructed from a variety of tables, sources, services, etc. ADO.Net Framework was released with the .Net Framework 3.5 SP1.

p.420. XAML. Extensible Application Markup Language.
    - WPF - XAML - define interface by XML(XAML), independent from application logic.

p.523. DLINQ. Based on ADO.NET. Data LINQ.

p.557. PART VI. Build web app.
   - ASP.NET server control
   - HTML control (runat="server")
   - theme
   - web forms validation controls.

p.623. Web service
   - REST: request by specifically formatted URL
   - SOAP: request by XML message.

Sams Teach Yourself SQL in 24 hours

CH 4. p.61.
   Normalization - reduce redundancy (will reduce performance due to more JOINs, will use more CPU/mem/IO).
   Denormalization. Combines tables, controlled redundancy. Increased performance.

CH 6. Transaction.
   Commit, Rollback, Savepoint

CH 8. All, Some, Any

CH 9. Aggregate functions.

CH 10. Sorting and Grouping.
   Group by.
   - rollup - get subtotal
   - cube - crosstab reports
   - having - GROUP BY/HAVING is similar to SELECT/WHERE
   - p. 243. UNION (no duplicate rows), UNION ALL (including duplicate rows).
   - INTERSECT
   - EXCEPT

CH 16. p. 256. Indexes.
   - When to avoid using indexes.. p. 261

CH 17. Improve DB performance
   - DB tuning / SQL tuning
   - To avoid full table scan, then use Index

CH 18. Manage DB Users
   - Schema - a collection of DB objects that a user owns.
   - DB user - aschema owner
   - Default schema - dbo (db owner)

CH 19. p. 299. Manage DB security.
   - Privilege
   - Control user access. GRANT, REVOKE, ROLE

CH 20. View, Synonym

CH 21. System Catalog

CH 22. Advanced SQL

MSSQL Server 2008

p.103. CH 7. Partitioning

p.219. CH 15. DB snapshots. CREATE database AS Snapshot

p.375. Part VII. Business Intelligence.
CH24. SSIS - SQL Server Integration Services
CH25. SSRS - SQL Server Reporting Services
CH26. SSAS - SQL Server Analysis Services

SSMS - SQL Server Management Studio

Big Data for Dummies

- Part I. Big Data. 3 characteristics: volume, velocity, variety
   - technology: MapReduce, BigTable, Hadoop (started at Yahoo)
- Part II. Tech foundations. p. 68. Hypervisor
   - cloud: Amazon (EC2, 2006), Google (Big Data), MS (Azure), Rackspace, NASA (OpenStack).
- Part III. Management
   - CH 7. Relational Database. CRUD, ACID.
                Non-relational Database.
   - CH 8. MapReduce
   - CH 9. Hadoop
- Part IV. Analytics & Big Data
   - p.145. Data Mining: classfication, log regression, NN, clustering (k-means etc).

Some C#, ASP.NET and SQL books

Browsed some C#, ASP.NET and SQL books. Quickly get through large amount of material, to review and refresh old knowledge, and gain an understanding of new advancement. Most of these books are written for beginners. However, each more or less covers things I did not notice in the past.

- ASP.NET 2.0 in C# 2005
- Learn Microsoft Visual C# 2010. By John Paul Mueller
- murach's ASP.NET web programming with C# 2010, 4th Edition.
- Visual C# 2008. By John Sharp
- Visual C# 2012. By John Sharp. (This book is a minor update from the 2008 version)
- Sams Teach Yourself SQL in 24 hours
- MS SQL Server 2008, by Mike Hotek.
- Big Data for Dummies

Long words short, ASP.NET evolution (see wiki page on ASP.NET):

- 2002.1   1.0  OO, based on windows programming, can use DLL. ADO.NET. VS.NET
- 2003.4   1.1  Automatic input validation; mobile controls. Bug fix, performance increase. VS.NET 2003.
- 2005.11  2.0 Major updates: Partial class, Generics, Anonymous methods, Iterators, master page, theme, navigation, Grid/Form/DetailsView, Login, skin etc. VS.NET 2005
- 2006.11  3.0 WCF, WPF, WFF,
- 2007.11  3.5 MVC (easier to test and for plugable IoC containers etc.), Silverlight, LINQ, Ajax, ADO.NET Entity Framework, ListView, DataPager etc. VS.NET 2008. Windows Server 2008.
- 2012.4    4.0 Parallel extensions.
- 2012.8    4.5 VS.NET 2012. Windows Server 2012. Window 8.

Wednesday, July 3, 2013

sed

Use sed to replace some string in a large file (example tested is 2GB in size):

sed -i 's/old_value/new_value/g' filename

Some more examples are there. It seems like in vi using "%s/old_value/new_value/g" is much slower.

Saturday, June 15, 2013

Some more books read recently

编程之美.
数学之美. 吴军著.
Big Data (大数据时代, 2013). Viktor Mayer-Schonberger, Kenneth Cukier.
晓说2. 高晓松著.

Wednesday, June 12, 2013

Books on Programming Contests

算法艺术与信息学竞赛. 刘汝佳, 黄亮著 (2003). 经典著作. 高度推荐. 理论介绍全面深入, 一些别的书不常见的内容这里也有. 难度较高. 有大量经典题目. 2003版为一册. 最近新版(算法竞赛入门经典(2009), 算法竞赛入门经典--训练指南(2012), 算法艺术与问题求解)更新为三册. 百度百科简介.

ACM国际大学生程序设计竞赛. 俞勇主编 (2013). (ISBN 978-7-302-29413-9). 分四册. 全面精简, 书薄, 总结得好, 易读易懂易记. 高度推荐.

实用算法的分析与程序设计(1998)/新编实用算法的分析与程序设计(2008). 吴文虎、王建德编著. 经典老著作.

程序设计中使用的数据结构. 王建德, 吴永辉编著 (2012). (ISBN 978-7-115-26865-5). 组织分成三部分: 线性表, 树, 图. 比较全面. 例题有一定难度. 但是伪代码难读, 排版需要改进, 感觉也不是太有必要包括, 读者不太可能去仔细阅读伪代码.

ACM程序设计(国际大学生程序设计竞赛指南). 曾棕根 编著. 第二版(2011). 内容浅显, 非常适合入门, 推荐给刚刚开始接触程序竞赛的人. 比较有用的是第二章.C++STL泛型编程, 详细介绍了STL.

其它有很多关于真题集的书. 有兴趣有时间可以看. 不然的话, 以上的书打好基础完全够用了.

C++ weblog

Saturday, March 23, 2013

Monitor url content

if wget -q [url] -O - | grep [keyword regular expression] > /dev/null
then
   sendmail -t < From: from@email.com
To: to@email.com
Subject: your key word appeared
in URL
FOO
fi


Add this script to crontab. 

Monday, March 18, 2013

T-SQL III

1. Examples of using sp_executesql with output parameters.

Two scenarios: 1) execute a query, 2) execute a stored procedure.
See [1] http://support.microsoft.com/kb/262499

-- Example 1.

DECLARE @SQLString NVARCHAR(500)
DECLARE @ParmDefinition NVARCHAR(500)
DECLARE @IntVariable INT
DECLARE @Lastlname varchar(30)
SET @SQLString = N'SELECT @LastlnameOUT = max(lname)
                   FROM pubs.dbo.employee WHERE job_lvl = @level'
SET @ParmDefinition = N'@level tinyint,
                        @LastlnameOUT varchar(30) OUTPUT'
SET @IntVariable = 35
EXECUTE sp_executesql
@SQLString,
@ParmDefinition,
@level = @IntVariable,
@LastlnameOUT=@Lastlname OUTPUT
SELECT @Lastlname
 
-- Example 2.
CREATE PROCEDURE Myproc
    @parm varchar(10),
    @parm1OUT varchar(30) OUTPUT,
    @parm2OUT varchar(30) OUTPUT
    AS
      SELECT @parm1OUT='parm 1' + @parm
     SELECT @parm2OUT='parm 2' + @parm
GO
DECLARE @SQLString NVARCHAR(500)
DECLARE @ParmDefinition NVARCHAR(500)
DECLARE @parmIN VARCHAR(10)
DECLARE @parmRET1 VARCHAR(30)
DECLARE @parmRET2 VARCHAR(30)
SET @parmIN=' returned'
SET @SQLString=N'EXEC Myproc @parm,
                             @parm1OUT OUTPUT, @parm2OUT OUTPUT'
SET @ParmDefinition=N'@parm varchar(10),
                      @parm1OUT varchar(30) OUTPUT,
                      @parm2OUT varchar(30) OUTPUT'

EXECUTE sp_executesql
    @SQLString,
    @ParmDefinition,
    @parm=@parmIN,
    @parm1OUT=@parmRET1 OUTPUT,@parm2OUT=@parmRET2 OUTPUT

SELECT @parmRET1 AS "parameter 1", @parmRET2 AS "parameter 2"
go
drop procedure Myproc
2. Use FETCH to go through a cycle.

-- Example 3. Note the use of: 1) cursor, 2) fetch, 3) raiserror to print without wait.

DECLARE @tbl varchar(20)
DECLARE c CURSOR FOR SELECT empID FROM test.dbo.Employee
OPEN c

FETCH NEXT FROM c INTO @tbl
WHILE @@FETCH_STATUS = 0
BEGIN
    --print @tbl
    RAISERROR(@tbl, 0, 1) WITH NOWAIT
    FETCH NEXT FROM c INTO @tbl
END

CLOSE c
DEALLOCATE c

Sunday, March 17, 2013

T-SQL II

Follow T-SQL I. Now assume some employees make phone calls. How to get the total amount of phone call time of the employees managed by a manager?

Say manager's empID is 1, then it's easy to get:

select SUM(phoneusage) AS PhoneUsage from Employee where mgrID = 1

What if you want the manager's name etc. to be returned on the same row? You could do this:

select A.EmpID, A.ManagerName, B.PhoneUsage FROM
(
    (select ID = 1, EmpID, (firstname + ' ' + lastname) AS ManagerName from Employee where empID = 1) A
    INNER JOIN
    (select ID = 1, SUM(phoneusage) AS PhoneUsage from Employee where mgrID = 1) B
    ON A.ID = B.ID
)

Or, you can first get the phone usage of employees for all the managers:

SELECT A.empID, (A.firstName + ' ' + A.lastName) as [ManagerName], B.phoneUsage FROM
(
    (select mgrID, SUM(phoneUsage) as phoneUsage from Employee group by mgrID) as B
    INNER JOIN Employee A
    ON A.empID = B.mgrID
)

Then you can add "AND A.empID = 1" to get the phone usage of employees managed by manager whose empID is 1:

SELECT A.empID, (A.firstName + ' ' + A.lastName) as [ManagerName], B.phoneUsage FROM
(
    (select mgrID, SUM(phoneUsage) as phoneUsage from Employee group by mgrID) as B
    INNER JOIN Employee A
    ON A.empID = B.mgrID
    AND A.empID = 1
)

The last method is better than the first method, since it does not use an artificial ID.

Parsing expression grammar (PEG)

Parsing expression grammar or PEG is a type of analytic formal grammar. It was first proposed by Bryan Ford in 2004. It is related to the top-down parsing model.

It looks similar to CFG, but different in that it uses "/" instead of "|" to specify the derivation choices of a non-terminal. In doing so it also specifies that the rules delimited by "/" are chosen in the order of appearance. This in effect adds a priority order to the rules involved, and avoids relevant ambiguity. Say in a CFG state you have two rules in shift/reduce or reduce/reduce conflict, which in CFG needs extra priority/precedence specification, now in PEG one just choose the one with high priority. This way it claims that: unlike CFG, PEG cannot be ambiguous.

Any PEG can be parsed in linear time by using a packrat parser.

Advantages of PEG include: no ambiguity; more powerful than regular expressions (this does not sound a very special advantage though, since LL/LR are also more powerful than regular expressions); does not need a separate tokenization step (such as by lex), tokenization rules in PEG can be written the same way as other grammar rules.

Disadvantages of PEG include: large memory consumption; cannot use left-recursion (same as LL); may contain subtle grammar errors and the author needs to tweak the grammar a bit; cannot recognize some unambiguous non-deterministic grammars, for example: S ← 'x' S 'x' | 'x'. Note that LL and LR also cannot recognize this, but CYK algorithm can.

Reference:
[1] http://en.wikipedia.org/wiki/Parsing_expression_grammar

Friday, March 15, 2013

T-SQL I

--
-- This script solves the problem:
-- Given an Employee table, each rows has empID, mgrID, firstName, lastName,
-- return all the managers in hierarchy for a given employee.
-- This is easy, but can also be extended to something of medium complexity.
--
-- This T-SQL script demonostrates:
-- +  create/drop stored procedure
-- +  declare table and variable
-- +  while loop
-- +* assign parameter in dynamically constructed query
-- +  stored procedure returns variable or table
-- + convert data type
-- + execute stored procedure in t-sql
--
-- @execute from command line: sqlcmd -S localhost -d test -i getList.sql
--   -s: server, -d: database, -i: input file
--
-- @Author: HomeTom
-- @Created on: 3/15/2013
-- @Last modified: 3/15/2013
--

--if object_id('dbo.getList') is not null
--    drop procedure dbo.getList
--go

--create procedure getList
--  @ID varchar,
--  @EID int output
--as
begin

SET NOCOUNT ON;

declare @tbl table (
    empID varchar,
    name  varchar(50)
)

declare @empID varchar --= 5
declare @mgrID varchar
declare @name varchar(100)
declare @cond int = 1
declare @query nvarchar(512)

set @empID = 5 --@ID

while @cond = 1
BEGIN
    --print @empID
    -- CONVERT(varchar, @empID)

    IF NOT EXISTS (select empID from Employee WHERE empID = @empID) BREAK

    set @query = 'select @name = firstname + '' '' + lastname from Employee WHERE empID = ' + @empID
    exec sp_executesql @query, N'@name varchar(100) output',  @name = @name output
    insert into @tbl (empID, name) values (@empID, @name)
   
    -- get manager's empID.
    set @query = 'select @empID = mgrID from Employee WHERE empID = ' + @empID
    exec sp_executesql @query, N'@empID varchar output',  @empID = @empID output
    if @empID is null set @cond = 0
END

select * from @tbl

--select @EID = -999

end
go


-------------------
-- execute getList
-------------------

--USE [test]
--GO

--DECLARE    @return_value int,
--        @EID int

--SELECT    @EID = 1

--EXEC    @return_value = [dbo].[getList]
--EXEC    [dbo].[getList]
--        @ID = N'1',
--        @EID = @EID OUTPUT

--SELECT    @EID as N'@EID'
--select @EID

--SELECT    'Return Value' = @return_value

--GO

-------------------
-- Table Employee
-------------------

-- CREATE TABLE [dbo].[Employee](
--     [empID] [int] NOT NULL,
--     [mgrID] [int] NULL,
--     [firstName] [varchar](50) NULL,
--     [lastName] [varchar](50) NULL,
--  CONSTRAINT [PK_Employee] PRIMARY KEY CLUSTERED
-- (
--     [empID] ASC
-- )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
-- ) ON [PRIMARY]


Tuesday, February 12, 2013

SOAP in PHP


See: http://www.perl.com/pub/2001/01/soap.html
SOAP server: server.cgi
#!/usr/bin/perl -w
use SOAP::Transport::HTTP;

SOAP::Transport::HTTP::CGI     
    -> dispatch_to('Demo')       
    -> handle;

package Demo;

sub hi {                       
    return "hello, world\n";       
}

sub languages {                
    return ("Perl", "C", "sh");  
}

SOAP client: client.pl
#!/usr/bin/perl -w
use SOAP::Lite;

$s = SOAP::Lite
    -> uri('http://localhost/Demo')
    -> proxy('http://localhost/DFetch_stamp/tmp/hibye.cgi')
    -> hi()
    -> result; # can use result() too.
print $s;
 
Now run: perl client.pl, you can see the output.

Other notes:
- multiple line comments in perl: start with "=pod", end with "=cut".

Blog Archive

Followers