Tag Archive : problems

/ problems

Digital Forensics : Ann’s Aurora

December 11, 2015 | Labs, Writeups | No Comments

Ann’s Aurora is a digital forensic challenge created by SANS Digital Forensics & Incident Response (DIFT). This problem is released as a public challenge for communities. The problem can be considered as network forensic released as per 2011 with not so complicated difficulty. The challenge can be seen on this link.

This article will discuss about solution for Ann’s Aurora case.

All steps performed here has been tested on following environment:

  1. Windows 8.1 64-bit
  2. Linux Slackware 14.1 64-bit
  3. Sans Institute Forensics Toolkit (SIFT) Linux version 3.0

And tools used here is:

  1. Wireshark

Most of screenshot taken here are result of test performed on machine [1], however we have ensure that all steps are applicable to machine [2] and [3].

Incident Description

Ann Dercover want to steal secret recipe from SaucyCorp. She then tailing lead developer Vick Timmes and investigate how to access SaucyCorp server. Ann know Vick login with his laptop computer (IP 10.10.10.70) and do VPN connection to SaucyCorp.

Ann then use 0-day exploit for Internet Explorer and attack with Client-Side Spear Phising Attack to fool Vick Timmes browser.

List of Goals

Basically, this case has following foals:

  1. Know full URI of web request done by Vick Timmes.
  2. Know content of 1300-element wide array labeled as COMMENT which is response from Ann.
  3. Know second HTTP request done by Vick
    • Requested file name
    • MD5 hash value of requested file name
  4. Know time of TCP session establishment on port 4444
  5. Know time of TCP session closed on port 4444
  6. Know type and MD5 hash value of packet 17 from server.
  7. To Analyze Vick’s computer behavior when trying to establish connection to server on port 4445 after disconnected from port 4444.
    • How fast TCP Initial Sequence Number does changed.
    • How fast IP ID does changed.
    • How fast source port does changed.
  8. Know time of connection to port 4445 successfully established.
  9. Know MD5 hash value of data sent on port 4445 by server.
  10. Know time of TCP closed on port 4445.

These goals is summary of questions / goals proposed by the challenge, which can be seen on following link.

Evidences

The only evidence we have is a pcap formatted file “evidence06.pcap”. This file is records of all activities in network at incident time. The evidence can be downloaded freely on this link. To download it, you have to login to member area, which you can register there free.

Analysis

Preliminaries

File evidence06.pcap has MD5 hash: efac05c50c0ae92bf0818e98763920bd

Following screenshot prove it. This value should be similar to value written on challenge page.

uas_1

Know we want to know some general information about it.

  1. How many IP address involved?
  2. How many pair of (source,destination) port involved?

This means to grab raw estimation about scope of analysis.

For first goal, we use tshark which is command line version of wireshark. Our attention go to source and destination address recorded on this file.

Sender addresses can be extracted by:

tshark –r evidence06.pcap –T fields –e ip.src | sort | uniq

Which has following result:

uas_2

It should be clear that there are two unique address as sender address on this file:

  • 10.10.10.10
  • 10.10.10.70 (Vick Timmes)

We guess that 10.10.10.10 is Ann Dercover machine.

Destination addresses can be extracted by:

tshark –r evidence06.pcap –T fields –e ip.dst | sort | uniq

Which has following result:

uas_3

It is clear that the addresses which involved as destination address are:

  • 10.10.10.10 (probably Ann Dercover)
  • 10.10.10.70 (Vick Timmes)
  • 10.10.10.255

Address 10.10.10.10 and 10.10.10.70 exists on both side, we then make a hypothesis that both machine has direct communication each other. While address 10.10.10.255 only appear on destination side. We guess that this address is server of SaucyCorp server which is targeted by Ann Dercover.

Next we want to know pair of port used for data exchange. There are two type of transfer protocols: TCP and UDP.

tshark –r evidence06.pcap –T fields –e tcp.port | sort | uniq

Following is the result:

uas_4

And UDP port pairs:

tshark –r evidence06.pcap –T fields –e tcp.port | sort | uniq

With following result:

uas_5

Note that all pairs are not representing established connection. We know that one connection might use different pair on runtime which is quite common.

Attack Vector used by Ann is Microsoft Internet Explorer “Aurora” Memory Corruption. This exploit is indexed as MS10-002 pada bulletin Microsoft Security (https://technet.microsoft.com/library/security/ms10-002) or CVE-2010-0249 on Common Vulnerabilities and Exposures (http://www.cvedetails.com/cve/2010-0249).

Basically, this exploit attack browser Internet Explorer by the way Internet Explorer handle deleted objects. This exploit first create massive-size temporary garbage data. Next when the data is deleted, exploit will point to that memory location and attack can plan malware there. In this case, the payload used is Trojan Hydraq.

Goal I Solution (Full URI)

Seeing first packet we can get full URI as following:

uas_6

uas_1a

In other words, URI accessed by Vick is: http://10.10.10.10:8080/index.php

First packet is a part of HTTP request to get index page. Response to this request is packet indexed as packet 2 to packet 6 and also packet 8, which are subject to Goal 2.

Goal II Solution (Response String)

Second packet is an ACK message of first HTTP request, while the response text are given as packet 3 to packet 6 and also packet 8. Five of them are pieces of HTML file which is content of index page.

To get all of those fragments, we should track every packet involved. Right click on the first packet and use “Follow TCP Stream”. Make sure we use “Raw” option. The message is plaintext, so we can copy the content directly.

The content of that file can be seen on this link.

To ease analysis, we can do identifier substitution. Identifier substitution can be done without alter the script logic. The modification can be seen on this link.

In this part, we wan to know what the content of 1300 object array are. This can be done by source code analysis which refer to simplified and deobfuscated code.

The code will execute function “func2(event)” when the page is loaded for first time. Function func2() will then create a time which will execute func3 on next 50 milliseconds. This means, func3() will be executed, whatever it is. While func1() only executed once when executing func2(). Based on our knowledge that attack vector used is CVE-2010-0249, we know that func1() is responsible to generate massive-size temporary object which later be deleted.

About 1300 object created on array var2. Data on this array is a unicode-escaped string. The string is:

\u0c0f\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d
\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d\u0c0d

Or can be interpreted as following strings:

ఏ఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍఍

This massive object is used to flood the system and create perfect condition to execute the exploit, see further on (http://www.symantec.com/connect/blogs/trojanhydraq-incident-analysis-aurora-0-day-exploit).

Goal III Solution (Second HTTP Request)

Second HTTP request is done to obtain http://10.10.10.70:8080/ index.phpmfKSxSANkeTeNrah.gif

It is a GIF file on iframe which is automatically downloaded by browser. Request to this file is pointed by packet 9. While the response itself is on packet 11.

The requested file is a GIF file which can be seen by the signature in header as following:

uas_8

Right click on “Compuserve GIF, Version: GIF89a” in packet content and choose “Export Selected Packet Bytes”. Save it as GIF.

Our MD5 value calculation give us df3e567d6f16d040326c7a0ea29a4f41.

uas_9

Goal IV Solution ( Opening TCP Session Port 4444 )

To know time of connection establishment, we need to build basic understanding:

  • Destination IP address is 10.10.10.10
  • Destination port is 4444

Thus, we can set rule on Wireshark as following:

ip.dst==10.10.10.10 && tcp.dstport==4444

Knowing TCP three-way handshake, we know that connection is established when TCP handshake finished. It is marked by ACK message from initiator (one who want to connect). In this case, 10.10.10.70 to 10.10.10.10.

Inspection will lead us to packet 13 and 15. Packet 13 is a SYN message to start TCP handshake and packet 15 is ACK acknowledge and finish TCP handshake. Beyond this point, TCP session has been established.

Packet 15 is on 1.3 seconds offset relative to first packet seen on this file. It means the connection is opened on 1.3.

Goal V Solution ( Closing TCP Session Port 4444 )

To know time of connection closed, we need to build basic understanding:

  • Destination IP address is 10.10.10.10
  • Destination port is 4444

Thus, we can set rule on Wireshark as following:

(ip.dst==10.10.10.10 && tcp.dstport==4444) || (ip.dst==10.10.10.70 && tcp.srcport==4444)

Knowing TCP four-way handshake to close connection, we got idea about how TCP connection closed. It is marked by ACK from initiatior. However, we cannot point yet who become initiator, whether 10.10.10.10 or 10.10.10.70 both have possibilities. Therefore we need to look it on two conditions.

Inspection will lead us to pair packet 1563, 1565 and 1562,1564. Packet 1562 and packet 1563 are packet [FIN,ACK] where packet 1562 is sent first from 10.10.10.10 to 10.10.10.70.Therefore we can conclude that the server has initiative to end the connection after it finish transmitting. ACK packet is sent by 10.10.10.10 on packet 1564 to state end of connection.

Packet 1564 is on 87.6 second, which means the connection ended on 87.6

Goal VI Solution( Sent File Type )

Inspected packet is packet 17. We need to check header of the payload and search for some signatures.

uas_7

Here, we see that MZ is a signature for PE32 executable file. We then have confidence after see string “This program cannot be run in DOS mode” which is backward compatibility offered by Microsoft to ensure program is still recognized by older Windows but not for running purpose.

Therefore, the file sent is a Windows executable or PE32.

To do extraction, we need to collect fragments by using “Follow TCP Stream” on packet 17. Click “Save As” to store it, let say as “suspect1.exe”

MD5 checksum can be computed by following: b385b7cf73ba0e75b20e4fdb310717c2

uas_10

Goal VII Solution ( Vick’s Computer Behavior )

For this goal, we need to use following understanding

  • Destination IP address is 10.10.10.10
  • Destination port is 4445

Thus, we can set rule on Wireshark as following:

(ip.dst==10.10.10.10 && tcp.dstport==4445) || (ip.dst==10.10.10.70 && tcp.srcport==4445)

By previous goal (goal VI), we know that connection to port 4444 is closed on packet 1564, 87.6 second. Therefore we focus our attention to packet after 1564, which are 1566 and 1656 (when 10.10.10.10 send RST notification and 10.10.10.70 send SYN back).

1. Initial Sequence Number

Initial Sequence Number is a unique 32-bit embedded to any new connection on TCP-based connection. ISN value expected to be random. Sequence number is the 4th octet on TCP header.

On packet 1566, ISN is 26. Here is the table of first 5 ISN number:

Initial Sequence Number # Packet Entry
26 1557 (6 entries)
27 1568 (6 entries)
28 1574 (6 entries)
29 1580 (6 entries)
30 1586 (6 entries)

 

In this pattern we get conclusion that ISN changing is done for every three packets. This is based on knowledge that the entries are actually pair of request-response.

2. IP ID

IP ID or Internet Protocol Identification is 16-bit number used to identify fragment groups from IP datagram. IP ID is the 4th octet of IP header.

On packet 1566, IP ID is 553. Here is table of first 10 IP ID seen:

IP Identification # Packet Entry
553 1566 ( 2 entries)
554 1568 ( 2 entries)
555 1570 ( 2 entries)
556 1572 ( 2 entries)
557 1574 ( 2 entries)
558 1576 ( 2 entries)
559 1578 ( 2 entries)
560 1580 ( 2 entries)
561 1582 ( 2 entries)
562 1584 ( 2 entries)

 

By this pattern, we can conclude that IP ID is changing for every packet sent. This is based on knowledge that the entries are actually pair of request-response.

Note: IP header in this file is referred to IPv4 header.

3. Source Port

Source Port is 16-bit number used identify port for sending packet. This number is the 0-th octet on TCP header.

Packet 1566 has Source Port 1041. Here we present table of 4 first Source Port seen:

Source Port # Packet Entry
1041 1533 (35 entries, 12 seconds)
1042 1568 (30 entries, 12 seconds)
1043 1598 (30 entries, 11 seconds)
1044 1628 (30 entries, ~ seconds)

 

In this pattern, we can conclude that Source Port is changing in interval between 10 and 15 seconds from previous port usage.

Goal VIII Solution ( Opening TCP Session Port 4445)

For this goal, we need to use following understanding

  • Destination IP address is 10.10.10.10
  • Destination port is 4445

Thus, we can set rule on Wireshark as following:

(ip.dst==10.10.10.10 && tcp.dstport==4445) || (ip.dst==10.10.10.70 && tcp.srcport==4445)

From goal VII we get that connection establishment are tried until packet 1656 [SYN]. In packet 1657 we see that 10.10.10.10:445 gives response [SYN,ACK] signaling that TCP three-way handshake is on the way.

TCP three-way handshake for port 4445 is finished on packet 1657 when 10.10.10.70 send packet ACK so connection is established on packet 1657.

Packet 1567 happen on 123.7 which means the connection is open on 123.7.

Goal IX Solution ( MD5 Hash of Executable File )

Ann (10.10.10.10) has initiative to send executable file to Vick (10.10.10.70), seen from packet 1659 which is [PSH,ACK] message. The message itself is sent from packet 1660. Using “Follow TCP Stream” on packet 1660, we can see whole data exchange. However the data itself is defected with following message: “[1436350345 bytes missing in capture file]”

Save the PE32 data as suspect2.exe and click “Save As”. Open up suspect2.exe using editor capable of viewing hex number (however hexeditor is recommended). Delete the first 42 bytes so we have MZ signature.

MD5 calculation gives us efbf686f09de864bd0058dab650ac34a as result of checksum.

uas_11

Goal X Solution (Closing TCP Session Port 4445)

For this goal, we need to use following understanding

  • Destination IP address is 10.10.10.10
  • Destination port is 4445

Thus, we can set rule on Wireshark as following:

(ip.dst==10.10.10.10 && tcp.dstport==4445) || (ip.dst==10.10.10.70 && tcp.srcport==4445)

Knowing TCP four-way handshake, we know that connection closed when TCP handshake is over. It is marked by ACK from initiator. However, we don’t know for sure yet who the initiator is.

Inspection lead us to pair packet 2552, 2554 and 2551,2553. Packet 2552 and packet 2553 are packet [FIN,ACK] where packet 2552 is sent first from 10.10.10.70 to 10.10.10.10. Therefore we can get conclusion that client has initiative to end the connection after server finish sending the data. Packet ACK is sent by 10.10.10.70 on entry 2554, present us to fact that connection is ended.

Packet 2554 happen on 198.4 which means the connection is closed roughly at 198.4 seconds relative to first of packet.

Conclusion

Ann Dercover has done break-in activity and has been proved planting malware to Vick Timmes.

To know what data stolen are, further investigation is needed.


Satria Ady Pradana
I3510030
School of Electrical Engineering & Informatics
Institut Teknologi Bandung

Fibonacci Numbers in C\C++

December 11, 2015 | Article | No Comments

In mathematics, Fibonacci numbers or Fibonacci sequence are the numbers in the following integer sequence:

fibonacci_1

or:

fibonacci_2

In mathematical terms, the Fibonacci sequence Fn is defined by the recurrence relation

fibonacci_3

with seed values

fibonacci_4

Our task is to write a function that returns Fn. Following are different methods to get the nth Fibonacci number

Method 1: Simple and Naive Recursion

A simple recursion that implementing the Fibonacci definition above.

int fib(int n)
{
    if (n <= 1) {
        return n;
    }
    return fib(n-1) + fib(n-2);
}

Time complexity: T(n) = T(n-1) + T(n-2) which is exponential.

This implementation does a lot of repeated work (see below).

                         fib(5)   
                     /             \     
               fib(4)                fib(3)   
             /      \                /     \
         fib(3)      fib(2)         fib(2)    fib(1)
        /     \        /    \       /    \  
  fib(2)   fib(1)  fib(1) fib(0) fib(1) fib(0)
  /    \
fib(1) fib(0)

Extra space: O(n) if we consider function call stack size. Otherwise O(1)

Method 2: Dynamic Programming – Top Down Approach

We can void the repeated work by storing the numbers calculated so far.

/* the real function to compute the fibonacci numbers */
int _fib(int * st, int n)
{
    /* if the value has not stored yet, computed it */
    if (st[n] == -1) {
        st[n] = _fib(st, n-1) + _fib(st, n-2);
    }

    return st[n];
}

/* a wrapper function */
int fib(int n)
{
    /* array to store the fibonacci numbers */
    int st[n+1];
    int res;
    int i;

    st[0] = 0;
    st[1] = 1;
    for (i=2; i<=n; i++) {
        st[i] = -1;
    }

    _fib(st, n);
    return st[n];
}

Time complexity: O(n)

Extra space: O(n)

Method 3: Dynamic Programming – Bottom Up Approach

Another approach which also store the calculated number so far.

int fib(int n)
{
    int st[n+1];
    int i;
    
    st[0] = 0;
    st[1] = 1;
    
    for (i=2; i<= n; i++)
    {
        st[i] = st[i-1] + st[i-2];
    }
    
    return st[n];
}

Time complexity: O(n)

Extra space: O(n)

Method 4: Dynamic Programming – Space Optimized Method 3

We can optimize the space used in method 3 by storing the previous two numbers only because that are all we need to get the next Fibonacci numbers.

int fib(int n)
{
    int a = 0, b = 1, c;
    int i;
    
    for (i=2; i<=n; i++)
    {
        c = a + b;
        a = b;
        b = c;
    }
    return b;
}

Time complexity: O(n)

Extra space: O(1)

Method 5: Using Matrix

This method relies on 2-dimensional system of linear difference equations for Fibonacci sequence. If we n times multiply the matrix M = {{1,1}, {1,0}} to itself (in other word calculate power(M, n)) then we get the (n+1)th Fibonacci number as the element at row and column (0,0) in the resultant matrix.

fibonacci_5

Here is how we do that:

/* Helper functions */
void multiply(int F[2][2], int M[2][2]);
void power1(int F[2][2], int n);

int fib(int n)
{
    /* We don't need to compute the matrix for these values */
    if (n<2)
        return n;
    
    int F[2][2] = {{1,1}, {1,0}};
    
    power(F, n-1);
    return F[0][0];
}

/* multiply 2 matrices F and M of size 2*2 and puts the result back to F */
void multiply(int F[2][2], int M[2][2])
{
    int w = F[0][0] * M[0][0] + F[0][1] * M[1][0];
    int x = F[0][0] * M[0][1] + F[0][1] * M[1][1];
    int y = F[1][0] * M[0][0] + F[1][1] * M[1][0];
    int z = F[1][0] * M[0][0] + F[1][1] * M[1][1];
    
    F[0][0] = w;
    F[0][1] = x;
    F[1][0] = y;
    F[1][1] = z;
}

/* designed for fib only */
void power(int F[2][2], int n)
{
    int i;
    int M[2][2] = {{1,1}, {1,0}};
    
    for (i=2; i<=n; i++)
    {
        multiply(F, M);
    }
}

Time complexity: O(n)

Extra space: O(1)

Method 6: Using Matrix – Optimized Method 5

The method 5 can be optimized to work in O(log n) time complexity. By implementing the Divide and Conquer to power and multiplication, we can do better.

/* Helper functions */
void multiply(int F[2][2], int M[2][2]);
void power(int F[2][2], int n);

int fib4(int n)
{
    if (n<2)
        return n;
    
    int F[2][2] = {{1,1}, {1,0}};
    
    power(F, n-1);
    return F[0][0];
}

/* multiply 2 matrices F and M of size 2*2 and puts the result back to F */
void multiply(int F[2][2], int M[2][2])
{
    int w = F[0][0] * M[0][0] + F[0][1] * M[1][0];
    int x = F[0][0] * M[0][1] + F[0][1] * M[1][1];
    int y = F[1][0] * M[0][0] + F[1][1] * M[1][0];
    int z = F[1][0] * M[0][0] + F[1][1] * M[1][1];
    
    F[0][0] = w;
    F[0][1] = x;
    F[1][0] = y;
    F[1][1] = z;
}

/* designed for fib only, optimized it using Divide and Conquer */
void power(int F[2][2], int n)
{
    if (n==0 || n==1)
        return;
    int M[2][2] = {{1,1}, {1,0}};

    power(F, n/2);
    multiply(F, F);
    if (n%2) {
        multiply(F, M);
    }
}

Time complexity: O(log n)

Extra space: O(log n) if we consider the function call stack size, otherwise O(1)

Programming Problem Set: 99 Problems Chapter 5

December 9, 2015 | Labs | No Comments

Ninety-nine Problems is generalized version to famous P-99: Ninety-Nine Prolog Problems collection used for teaching programming. The problems initially set for prolog but later many solutions come from various programming language. The purpose of this problem is to give us opportunity to practice our skills in logic programming. The goal is to find the most elegant solution of the given problem. Efficiency is important, but logical clarity is even more crucial.

The problem set are divided into seven categories / chapters: Lists, Arithmetic, Logic and Codes, Binary Trees, Multiway Trees, Graphs, and Miscellaneous.

In this chapter you will be only given a problem set. The solution might come however it would be on different page.

This chapter will cover about Multiway Trees. Multiway tree is composed of a root element and a (possibly) empty set of successors which are multiway trees themselves.

The problems in this chapter is a continuation of previous chapter, therefore the numbering will start from the last problem.

For the sake of writing, we will denote tree node as t (value, left, right).

63. Check whether a given object / term is a multiway tree

The predicate is succeeds if and only if its argument is a representing a multiway tree.

example:
is_tree_mp(t(a,[t(f,[t(g,[])]),t(c,[]),t(b,[t(d,[]),t(e,[])])])) -> true

64. Count the nodes of a multiway tree

Example: nnodes_m_p( t(a,[t(f,[])]),N )
 ->
2
Write another version which allows for a flow pattern

65. Tree construction from a node string

We suppose that the nodes of a multiway tree contain single characters. In the
depth-first order sequence of its nodes, a special character ^ has been inserted
whenever, during the tree traversal, the move is a backtrack to the previous level.

By this rule, the tree in the figure opposite is represented as: <tt>afg^^c^bd^e^^^</tt>

Define the syntax of the string and write a predicate tree(String,Tree) to construct
the Tree when the String is given. Work with atoms (instead of strings). Make your
predicate work in both directions.

66. Determine the internal path length of a tree

We define the internal path length of a multiway tree as the total sum of the path
lengths from the root to all nodes of the tree. By this definition, the tree in the
figure of problem 5.03 has an internal path length of 9

Write a predicate ipl(Tree,IPL) for the flow pattern (+,-).

67. Construct the bottom-up order sequence of the tree nodes

Write a predicate bottom_up(Tree,Seq) which constructs the bottom-up sequence of the
nodes of the multiway tree Tree. Seq should be a Prolog list.

What happens if you run your predicate back words?

68. Lisp-like tree representation

There is a particular notation for multiway trees in Lisp. Lisp is a prominent
functional programming language, which is used primarily for artificial intelligence
problems. As such it is one of the main competitors of Prolog. In Lisp almost
everything is a list, just as in Prolog everything is a term.

The following pictures show how multiway tree structures are represented in Lisp.

Note that in the "lispy" notation a node with successors (children) in the tree is
always the first element in a list, followed by its children. The "lispy"
representation of a multiway tree is a sequence of atoms and parentheses '(' and ')',
which we shall collectively call "tokens". We can represent this sequence of tokens as
a Prolog list; e.g. the lispy expression (a (b c)) could be represented as the Prolog
list ['(', a, '(', b, c, ')', ')']. Write a predicate tree_ltl(T,LTL) which constructs
the "lispy token list" LTL if the tree is given as term T in the usual Prolog notation.

Example:
tree_ltl_p (t(a,[t(b,[]),t(c,[])]),LTL).
-->
['(', a, '(', b, c, ')', ')']

As a second, even more interesting exercise try to rewrite tree_ltl/2 in a way that
the inverse conversion is also possible: Given the list LTL, construct the Prolog
tree T. Use difference lists.

Solution:

  1. Haskell
  2. Lisp
  3. Prolog
  4. Python

Programming Problem Set: 99 Problems Chapter 4

December 9, 2015 | Labs | No Comments

Ninety-nine Problems is generalized version to famous P-99: Ninety-Nine Prolog Problems collection used for teaching programming. The problems initially set for prolog but later many solutions come from various programming language. The purpose of this problem is to give us opportunity to practice our skills in logic programming. The goal is to find the most elegant solution of the given problem. Efficiency is important, but logical clarity is even more crucial.

The problem set are divided into seven categories / chapters: Lists, Arithmetic, Logic and Codes, Binary Trees, Multiway Trees, Graphs, and Miscellaneous.

In this chapter you will be only given a problem set. The solution might come however it would be on different page.

This chapter will cover about Binary Trees. Binary is either empty or it is composed of a root element and two successors, which are binary tree themselves.

The problems in this chapter is a continuation of previous chapter, therefore the numbering will start from the last problem.

For the sake of writing, we will denote tree node as t (value, left, right).

45. Check whether a given object / term is a binary tree

The predicate is succeeds if and only if its argument is a representing a binary tree.

example:
is_tree_p( t(a, t(b, null, null), null) -> true

46. Construct a completely balanced binary trees

In a completely balanced binary tree, the following property holds for every node: The number
of nodes in its left subtree and the number of nodes in its right subtree are almost equal,
which means their difference is not greater than one.

Write cbal_tree/2 to construct completely balanced binary trees for a given number of nodes.
The predicate should generate all solutions via backtracking. Put the letter 'x' as information
into all nodes of the tree.

Example: cbal_tree_p( 4 )
 ->
t(x, t(x, null, null), t(x, null, t(x, null, null)))
t(x, t(x, null, null), t(x, t(x, null, null), null))
...etc

47. Symmetric binary trees

Let us call a binary tree symmetric if you can draw a vertical line through the root node
and then the right subtree is the mirror image of the left subtree. Write symmetric/1 to check
whether a given binary tree is symmetric.

Hint: Write mirror/2 first to check whether one tree is the mirror image of another.
We are only interested in the structure, not in the contents of the nodes. 48. Binary search trees (dictionaries)
Write a predicate to construct a binary search tree from a list of integer numbers.

Example:
construct_p( [3,2,5,7,1] ).
-> t(3, t(2, t(1, nil, nil), nil), t(5, nil, t(7, nil, nil)))

49. Generate and test paradigm

Apply the generate-and-test paradigm to construct all symmetric, completely balanced binary trees
with a given number of nodes.

Example: sym_cbal_trees_p( 5 )
 ->
[t(x, t(x, nil, t(x, nil, nil)), t(x, t(x, nil, nil), nil)), t(x, t(x, t(x, nil, nil), nil),
 t(x, nil, t(x, nil, nil)))]

How many such trees are there with 57 nodes? Investigate about how many solutions there are for a
given number of nodes. What if the number is even? Write the appropriate solution.

50. Construct height-balanced binary trees

In a height-balanced binary tree, the following property holds for every node:
The height of its left subtree and the height of its right subtree are almost equal, which means their
difference is not greater than one.

Write a predicate hbal_tree/2 to construct height-balanced binary trees for a given height.
The predicate should generate all solutions via backtracking. Put the letter 'x' as information into
all nodes of the tree.

Example: hbal_p( 3 )
 ->
t(x, t(x, t(x, nil, nil), t(x, nil, nil)), t(x, t(x, nil, nil), t(x, nil, nil)))
t(x, t(x, t(x, nil, nil), t(x, nil, nil)), t(x, t(x, nil, nil), nil))
...etc

51. Construct height-balanced binary trees with a given number of nodes

Consider a height-balanced binary tree of height H. What is the maximum number of nodes it can contain?
Clearly, MaxN = 2^H - 1. However, what is the minimum number MinN? This question is more difficult.
Try to find a recursive statement and turn it into a predicate minNodes/2 defined as follwos:

minNodes_p( H ) -> the minimum number of nodes in a height -balanced binary tree of height H.

On the other hand, we might ask: what is the maximum height H a height-balanced binary tree with N nodes
can have?

maxHeight_p( N ) -> the maximum height of a height balanced binary tree with N nodes

Now, we can attack the main problem: construct all the height-balanced binary trees with a given number
of nodes.

hbal_tree_nodes_p( N ) -> height-balanced binary tree with N nodes.

Find out how many height-balanced trees exist for N = 15

52. Count the leaves of a binary tree

A leaf is a node with no successors.

Example: count_leaves_p( t(a, t(b, t(c, null, null), t(d, null, null)), t(e, null, null)) )
 -> 3

The leaf are [ t(c, null, null), t(d, null, null)), t(e, null, null)) ]

53. Collect the leaves of a binary tree in a list

A leaf is a node with no successors.

Example:
leaves_p( t(a, t(b, t(c, null, null), t(d, null, null)), t(e, null, null)) )
 ->
[ t(c, null, null), t(d, null, null)), t(e, null, null)) ]

54. Collect the internal nodes of a binary tree in a list

An internal node of a binary tree has either one or two non-empty successors.
Example: internals_p( t(a, t(b, t(c, null, null), t(d, null, null)), t(e, null, null)) )
->
[ t(a, t(b, t(c, null, null), t(d, null, null)), t(e, null, null)),
  t(b, t(c, null, null), t(c, null, null)) ]

55. Collect the nodes at a given level in a list

A node of a binary tree is at level N if the path from the root to the node has length N-1.
The root node is at level 1.

Example: atlevel_p( 2, t(a, t(b, t(c, null, null), t(d, null, null)), t(e, null, null)) )
->
[ t(b, t(c, null, null), t(d, null, null)), t(e, null, null)) ]

56. Construct a complete binary tree

A complete binary tree with height H is defined as follows:
The levels 1,2,3,...,H-1 contain the maximum number of nodes (i.e 2**(i-1) at the level i,
note that we start counting the levels from 1 at the root). In level H, which may contain less
than the maximum possible number of nodes, all the nodes are "left-adjusted". This means that in a
levelorder tree traversal all internal nodes come first, the leaves come second, and empty successors
(the nil's which are not really nodes!) come last.

Particularly, complete binary trees are used as data structures (or addressing schemes) for heaps.

We can assign an address number to each node in a complete binary tree by enumerating the nodes in
levelorder, starting at the root with number 1. In doing so, we realize that for every node X with
address A the following property holds: The address of X's left and right successors are 2*A and 2*A+1,
respectively, supposed the successors do exist. This fact can be used to elegantly construct a complete
binary tree structure. Write a predicate complete_binary_tree/2 with the following specification:

Example: complete_binary_tree_p( 3 ) -> The complete binary tre with N nodes

57. Layout a binary tree

Given a binary tree as the usual Prolog term t(X,L,R) (or nil). As a preparation for drawing the tree,
a layout algorithm is required to determine the position of each node in a rectangular grid.
Several layout methods are conceivable, one of them is shown in the illustration above.
In this layout strategy, the position of a node v is obtained by the following two rules:
(1) x(v) is equal to the position of the node v in the inorder
(2) y(v) is equal to the depth of the node v in the tree sequence

58. Layout a binary tree (2)

Figure located above

An alternative layout method is depicted in the above illustration. Find out the rules and write
the corresponding solution.
Hint: On a given level, the horizontal distance between neighboring nodes is constant.

Use the same conventions as in problem 57.

59. Layout a binary tree (3)

Figure located above

Yet another layout strategy is shown in the above illustration. The method yields a very compact
layout while maintaining a certain symmetry in every node. Find out the rules and write the
solution.
Hint: Consider the horizontal distance between a node and its successor nodes.
How tight can you pack together two subtrees to construct the combined binary tree?

Use the same conventions as in problem 57 and 58 and test your predicate in an appropriate way.
Note: This is a difficult problem. Don't give up too early!

Which layout do you like most?

60. A string representation of binary trees

Somebody represents binary trees as strings of the following type (see example):

a(b(d,e),c(,f(g,)))

(a) Write a solution which generates this string representation, if the tree is given as
usual (as nil or t(X,L,R) term). Then write a predicate which does this inverse;
i.e. given the string representation, construct the tree in the usual form.
Finally, combine the two predicates in a single predicate tree_string/2 which can be used in
both directions.

(b) Write the same predicate tree_string/2 using difference lists and a single predicate
tree_dlist/2 which does the conversion between a tree and a difference list in both directions.

For simplicity, suppose the information in the nodes is a single letter and there are no
spaces in the string.

61. Preorder and inorder sequence of binary trees

We consider binary trees with nodes that are identified by single lower-case letters, as in the
example of problem 60.

(a) Write predicates preorder/2 and inorder/2 that construct the preorder and inorder sequence of a
given binary tree, respectively. The results should be atoms, e.g. 'abdecfg' for the preorder sequence
of the example in problem 4.16.

(b) Can you use preorder/2 from problem part a) in the reverse direction;
i.e. given a preorder sequence, construct a corresponding tree? If not, make the necessary arrangements.

(c) If both the preorder sequence and the inorder sequence of the nodes of a binary tree are given,
then the tree is determined unambiguously. Write a predicate pre_in_tree/3 that does the job.

(d) Solve problems (a) to (c) using difference lists.

What happens if the same character appears in more than one node. Try for instance pre_in_tree(aba,baa,T).

62. Dotstring representation of binary trees

We consider again binary trees with nodes that are identified by single lower-case letters, as in
the example of problem 60. Such a tree can be represented by the preorder sequence of its nodes in
which dots (.) are inserted where an empty subtree (null) is encountered during the tree traversal.
For example, the tree shown in problem 60 is represented as <tt>'abd..e..c.fg...'</tt>.
First, try to establish a syntax (BNF or syntax diagrams) and then write a predicate tree_dotstring/2
which does the conversion in both directions. Use difference lists.

Solution:

  1. Haskell
  2. Lisp
  3. Prolog
  4. Python

Programming Problem Set: 99 Problems Chapter 3

December 9, 2015 | Labs | No Comments

Ninety-nine Problems is generalized version to famous P-99: Ninety-Nine Prolog Problems collection used for teaching programming. The problems initially set for prolog but later many solutions come from various programming language. The purpose of this problem is to give us opportunity to practice our skills in logic programming. The goal is to find the most elegant solution of the given problem. Efficiency is important, but logical clarity is even more crucial.

The problem set are divided into seven categories / chapters: Lists, Arithmetic, Logic and Codes, Binary Trees, Multiway Trees, Graphs, and Miscellaneous.

In this chapter you will be only given a problem set. The solution might come however it would be on different page.

This chapter will cover about Logic and Codes. The problems in this chapter served as continuation of previous problems, therefore the numbering will start from the last problem.

40. Truth tables for logical expression

Define and/2, or/2, nand/2, nor/2, xor/2, impl/2 and equ/2 (for logical equivalence) which
succeed or fail according to the result of their respective operations;
e.g. and(A,B) will succeed, if and only if both A and B succeed.

Now, write a predicate table/3 which prints the truth table of a given logical expression in two variables.

Example: table_p( A, B, xor(A,B) )
->
true true false
true false true
false true true
false false false

41. Truth tables for logical expressions (2)

Continue problem 40 by defining and/2, or/2, etc as being operators. This allows to write the
logical expression in the more natural way, as in the example: A and (A or not B).
Define operator precedence as usual; i.e. as in C.

Example: table2_p( A, B, A and (A or not B) )
 ->
true true true
true false true
false true false
false false false

41. Truth tables for logical expressions (2)

Generalize problem 41 in such a way that the logical expression may contain any number
of logical variables. Define table/2 in a way that table(List,Expr) prints the truth table
for the expression Expr, which contains the logical variables enumerated in List.

The index is started from 1.

Example: table3_p( [A,B,C], A and (B or C) equ A and B or A and C)
 ->
<tt>true true true true
true true false true
true false true true
true fail false true
false true true true
false true false true
false false true true
false false false true

43. Gray code

An n-bit Gray code is a sequence of n-bit strings constructed according to
certain rules. For example,

n = 1: C(1) = ['0','1'].
n = 2: C(2) = ['00','01','11','10'].
n = 3: C(3) = ['000','001','011','010','110','111','101','100'].

Find out the construction rules and write a predicate with the following specification:

% gray(N,C) :- C is the N-bit Gray code

Can you apply the method of "result caching" in order to make the predicate more
efficient, when it is to be used repeatedly?

44. Huffman code

First of all, study a good book on discrete mathematics or algorithms for a detailed
description of Huffman codes, or consult wikipedia

We suppose a set of symbols with their frequencies, given as a list of fr(S,F) terms.
Example: [fr(a,45),fr(b,13),fr(c,12),fr(d,16),fr(e,9),fr(f,5)]. Our objective is to
construct a list hc(S,C) terms, where C is the Huffman code word for the symbol S.
In our example, the result could be Hs = [hc(a,'0'), hc(b,'101'), hc(c,'100'), hc(d,'111'),
hc(e,'1101'), hc(f,'1100')] [hc(a,'01'),...etc.].

Example: huffman_p(Fs,Hs)
-> Hs is the Huffman code table for the frequency table Fs

Solution:

  1. Haskell
  2. Lisp
  3. Prolog
  4. Python

Programming Problem Set: 99 Problems Chapter 2

December 9, 2015 | Labs | No Comments

Ninety-nine Problems is generalized version to famous P-99: Ninety-Nine Prolog Problems collection used for teaching programming. The problems initially set for prolog but later many solutions come from various programming language. The purpose of this problem is to give us opportunity to practice our skills in logic programming. The goal is to find the most elegant solution of the given problem. Efficiency is important, but logical clarity is even more crucial.

The problem set are divided into seven categories / chapters: Lists, Arithmetic, Logic and Codes, Binary Trees, Multiway Trees, Graphs, and Miscellaneous.

In this chapter you will be only given a problem set. The solution might come however it would be on different page.

This chapter will cover about Arithmetic. A list is either empty or it is composed of a first element (head) and a tail, which is a list itself. As a continuation from previous chapter, the problem will be started from last previous number.

29. Determine whether a given integer number is prime.

Example: is_prime_p( 7 ) -> Yes

30. Determine the prime factors of a given positive integer.

Construct a list containing the prime factors in ascending order

Example: prime_factor_p( 315 ) -> [ 3, 3, 5, 7 ]

31. Determine  the prime factors of a given positive integer (2)

Construct a list containing the prime factors and their multiplicity.

Example: prime_factor2_p( 315 ) -> [ [3,2], [5,1], [7,1] ]

Hint: The solution of problem 10 may be helpful.

32. A list of prime number

Given a range of integers by its lower and upper limit, construct
a list of all prime numbers in that range.

Example: prime_list_p( 3, 15 ) -> [3, 5, 7, 11, 13 ]

33. Goldbach’s conjecture

Goldbach's conjecture says that every positive even number greater
than 2 is the sum of two prime numbers. Example: 28 = 5 + 23.
It is one of the most famous facts in number theory that has not
been proved to be correct in the general case. It has been numerically
confirmed up to very large numbers.
Find the two prime numbers that sum up to a given even integer

Example: goldbach_p( 28 ) -> [ 5, 23]

34. A list of Goldbach compositions

Given a range of integers by its lower and upper limit, print a
list of all even numbers and their Goldbach composition

Example: goldbach_list_p( 9, 20 )
10 = 3 + 7
12 = 5 + 7
14 = 3 + 11
16 = 3 + 13
18 = 5 + 13
20 = 3 + 17

In most case, if an even number is written as the sum of two prime
numbers, one of them is very small. Very rarely, the primes are
both bigger than say 50. Try to find out how many such cases
there are in the range 2..3000.

35. Determine the greatest common divisor of two positive integer number

Use Euclid's algorithm

Example: gcd_p( 36, 63 ) -> 9

36. Determine whether two positive integer numbers are coprime

Two numbers are coprime if their greates common divisor equals 1

Example: coprime_p( 35, 64 ) -> Yes

37. Calculate Euler’s totient function phi(m)

Euler's so-called totient phi(m) is defined as the number of pisitive
integers r (1 <= r < m) that are coprime to m.
If m = 10 then r = 1, 3, 7, 9; thus phi(m) = 4. Note the special case
phi(1) = 1

Example: phi_p( 10 ) -> 4

38. Calculate Euler’s totient function phi(m) (2)

See the previous problem for definition of Euler's totient function.
If the list of the prime factors of a number m is known in the form
of problem 32 then the function phi(m) can be efficiently calculated
as follows:

Let [[p1, m1], [p2, m2], [p3, m3], ...] be the list of prime factors
(and their multiplicities) of a given number m. Then phi(m) can be
calculated with following formula:

phi(m) = (p1-1)* p1^(m1-1) *(p2-1)* p2^(m2-1)*(p3-1)* p3^(m3-1)

Note that a^b stands for the b'th power of a.

39. Compare the two methods of calculating Euler’s totient function.

Use the solution of problem 37 and 38 to compare algorithm. Take
the number of logical inferences as a measure for efficiency. Try to
calculate phi(10090) as an example

Solution:

  1. Haskell
  2. Lisp
  3. Prolog
  4. Python

Programming Problem Set: 99 Problems Chapter 1

December 9, 2015 | Labs | No Comments

Ninety-nine Problems is generalized version to famous P-99: Ninety-Nine Prolog Problems collection used for teaching programming. The problems initially set for prolog but later many solutions come from various programming language. The purpose of this problem is to give us opportunity to practice our skills in logic programming. The goal is to find the most elegant solution of the given problem. Efficiency is important, but logical clarity is even more crucial.

The problem set are divided into seven categories / chapters: Lists, Arithmetic, Logic and Codes, Binary Trees, Multiway Trees, Graphs, and Miscellaneous.

In this chapter you will be only given a problem set. The solution might come however it would be on different page.

This chapter will cover about Lists. A list is either empty or it is composed of a first element (head) and a tail, which is a list itself.

01. Find the last element of a list

Example: last_p( [a, b, c, d] ) -> d

02. Find the last but one element of a list

Example: last_one_p( [a, b, c, d] ) -> c

03. Find the n’th element of a list

The index is started from 1.

Example: at_p( [a, b, c, d], 3) -> c

04. Find the number of elements of a list

Example: length_p( [a, b, c, d] ) -> 4

05. Reverse a list

Example: reverse_p( [a, b, c, d] ) -> [d, c, b, a]

06. Find out whether a list is a palindrome

A palindrome ca be read forward of backward.

Example: palindrom_p( [a, b, c, b, a] ) -> true

07. Flatten a nested list structure

Transform a lists as elements into a 'flat'
list by replacing each list with its element (recursively)

Example: flatten_p( [a, [b, [c, d], e]] ) -> [a, b, c, d, e]

08. Eliminate consecutive duplicates of list elements

If a list contains repeated elements they should be replaced
with a single copy of the element. The order of elements
should not be changed.

Example: compress_p( [a, a, a, a, b, c, c, a, a, d, e, e, e, e] ) -> [a, b, c, a, d, e]

09. Pack consecutive duplicates of list elements into sublists

If a list contains repeated elements, they should be placed
in separate sublist.

Example: pack_p( [a, a, a, a, b, c, c, a, a, d, e, e, e, e])
 -> [ [a, a, a, a], [b], [c, c], [a, a], [d], [e, e, e, e,] ]

10. Run-length encoding of a list

Consecutive duplicates of elements are encoded as terms [N, E]
where N is the number of duplicates of the element E.

Example: encode_p1( [a, a, a, a, b, c, c, a, a, d, e, e, e, e] )
 -> [ [4,a], [1,b], [2,c], [2,a], [1,d], [4,e]]

11. Modified run-length encoding

Consecutive duplicates of elements are encoded as terms [N, E]
where N is the number of duplicates of the element E. If an
element has no duplicates, it is simply copied into the result
list Only duplicates are transferred as [N, E] terms.

Example: encode_p2( [a, a, a, a, b, c, c, a, a, d, e, e, e, e] )
 -> [ [4,a], b, [2,c], [2,a], d, [4,e] ]

12. Decode a run-length encoding list

Given a run-length code list generated, construct its uncompressed
version.

Example: encode_p3( [4,a], [1,b], [2,c], [2,a], [1,d], [4,e])
 -> [a, a, a, a, b, c, c, a, a, d, e, e, e, e] ]

13. Run-length encoding of a list (direct solution)

Implement the so-called run-length encoding data compression method
directly. Don't explicitly create the sublists containing the
duplicates but only count them. Simplify the result list by
replacing the singleton terms [1,X] by X.

Example: encode_p4([a, a, a, a, b, c, c, a, a, d, e, e, e, e])
 -> [ [4,a], b, [2,c], [2,a], d, [4,e]]

14. Duplicate the elements of a list

Example: duplicate_p([a, b, c, d, e]) -> [a, a, b, b, c, c, d, d, e, e]

15. Duplicate the elements of a list a given number of times.

Example: duplicate_px([a, b, c, d, e], 3)
 -> [a, a, a, b, b, b], c, c, c, d, d, d, e, e, e]

16. Drop everh N’th element from a list

Example: drop_p([a, b, c, d, e, f, g, h, i, k], 3)
 -> [a, b, d, e, g, h, k]

17. Split a list into two parts; the length of the first part is given

Do not use any predefined predicates / function.

Example: split_p([a, b, c, d, e, f, g, h, i, k], 3, L1, L2)
 -> L1 = [a, b, c] ; L2 = [d, e, f, g, h, i, k]

18. Extract a slice from a list

Given two indices, I and K, the slice is the list containing the
elements between the I'th and K'th element of the original list
(both limits included). Start counting the element with 1.

Example: slice_p([a, b, c, d, e, f, g, h, i, k], 3, 7) -> [c, d, e, f, g]

19. Rotate a list N places to the left

Example:
rotate_p([a, b, c, d, e, f, g, h], 3) -> [d, e, f, g, h, a, b, c]
rotate_p([a, b, c, d, e, f, g, h],-2) -> [g, h, a, b, c, d, e, f]

20. Remove the K’th element from a list

Example: remove_p([a, b, c, d], 2) -> [a, c, d]

21. Insert an element at a given position into a list

Example: insert_p(x, [a, b, c, d], 2) -> [a, x, b, c, d]

22. Create a list containing all integers withing a given range.

Example: range_p(4, 9) -> [4, 5, 6, 7, 8, 9]

23. Extract a given number of randomly selected elements from a list.

The selected items shall be put into a result list.

Example: rnd_select_p1([a, b, c, d, e, f, g, h], 3) -> [g, a, c]

24. Draw N different random numbers from the set 1..M.

The selected numbers shall be put into a result list.

Example: rnd_select_p2(6, 49) -> [23, 1, 33, 21, 37, 17]

25. Generate a random permutation of the elemnts of a list.

Example: rnd_permut_p([a, b, c, d, e]) -> [b, a, d, c, e, f]

26. Generate the combinations of K distinct objects chosen from the N elements of a list.

In how many ways can a committee of 3 be chosen from a group of
12 people? There are C(12,3) = 220 possibilities denotes the
well-known binomial coefficients.

Example: combination_p([a, b, c, d, e, f])
[a, b, c]
[a, b, d]
[a, b, e]
...

27. Group the elements of a set into disjoint subsets.

(a) In how many ways can a group of 9 people work in 3 disjoint subgroups of 2, 3, and 4 persons?
Example: group3_p([aldo,beat, carla, david, evi, flip, gary, hugo, ida])

G1 = [aldo, beat], G2 = [carla, david, evi], G3 = [flip, gary, hugo, ida]


(b) Generalize the above function in a way that we can specify a list of group sizes and the 
predicate will return a list of groups

Example: group_p([aldo, beat, arla, david, evi, flip, gary, hugo, ida],[2, 2, 5])
 -> [[aldo,beat],[carla,david],[evi,flip,gary,hugo, ida]]
[/sourcecode]

28. Sorting a list of lists according to length of sublists

(a) Suppose that a list (InList) contains elements that are lists themselves.
 The objective is to sort the elements of InList according to their length.
 E.g. short lists first, longer lists later, or vice versa.


Example: lsort_p([[a, b, c], [d,e],[f,g,h], [d,e], [i,j,k,l], [m,n],[o]])
 -> [[o],[d, e],[d, e],[m, n],[a, b, c],[f, g, h],[i, j, k, l]]




(b) Suppose a list (InList) contains elements that are list themselves.
 But this time the objective is to sort the elements of InList according to
 their length frequency. i.e. in the default, where sorting is done ascendingly,
 lists with rare lengths are placed first, others with a more frequent length
 come later.

Example: lfsort_p([[a, b, c], [d,e],[f,g,h], [d,e], [i,j,k,l], [m,n],[o]])
 -> [[i, j, k, l],[o],[a, b, c],[f, g, h],[d, e],[d, e],[m, n]]

Note that in the above example, the first two list in the result have length
4 and 1, both lengths appear just one. The third and forth list have length
3, there are two list of this length. And finally the last three lists have
length 2. This is the most frequent length
[/sourcecode]

Solution:

  1. Haskell
  2. Lisp
  3. Prolog
  4. Python

Social media & sharing icons powered by UltimatelySocial