Tag Archive : socket


In the previous article, we discussed some basics of network programming, such as IP addresses, port numbers, and byte order. In this article we will look at part of the socket API and its basic usage.

Converting Byte Order

We know that different computers may use different endianness, while data on the network always travels in network byte order. To convert from host order to network order (or vice versa) we can use these functions:

htons()
htonl()
ntohs()
ntohl()

Some Structs

We will discuss the structs used here. Try to remember them, okay?

struct addrinfo – This structure is a more recent invention and is used to prepare the socket address structures for subsequent use. It is also used in host name lookups and service name lookups.

struct addrinfo {
   int              ai_flags;     // AI_PASSIVE, AI_CANONNAME, etc
   int              ai_family;    // AF_INET, AF_INET6, AF_UNSPEC
   int              ai_socktype;  // SOCK_STREAM, SOCK_DGRAM
   int              ai_protocol;  // use 0 for "any"
   socklen_t        ai_addrlen;   // size of ai_addr in bytes
   struct sockaddr *ai_addr;      // can be struct sockaddr_in or sockaddr_in6
   char            *ai_canonname; // full canonical hostname
   struct addrinfo *ai_next;	  // linked list, next node
};

Fill in a few fields of this struct and then call getaddrinfo(). It will return a pointer to a new linked list of these structures, filled out with all the information we need. We can force it to use IPv4 or IPv6, or leave ai_family as AF_UNSPEC to accept either.
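The call described above can be sketched roughly as follows. The helper name count_local_addresses and the service string "13510" are assumptions made for this illustration; the hints fields and the getaddrinfo(), gai_strerror() and freeaddrinfo() calls are the standard ones.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: count the addresses getaddrinfo() returns for a
 * passive (server) socket on the given service. Returns -1 on failure. */
int count_local_addresses(const char *service)
{
    struct addrinfo hints, *res, *p;
    int count = 0;

    memset(&hints, 0, sizeof(hints));   /* unused fields must be zero */
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6, whatever is available */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;        /* fill in a local wildcard address */

    int rc = getaddrinfo(NULL, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    /* Walk the linked list through ai_next. */
    for (p = res; p != NULL; p = p->ai_next) {
        printf("family: %s\n", p->ai_family == AF_INET ? "IPv4" : "IPv6");
        count++;
    }

    freeaddrinfo(res);   /* the list is allocated by getaddrinfo() */
    return count;
}

int main(void)
{
    printf("%d address(es) found\n", count_local_addresses("13510"));
    return 0;
}
```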

struct sockaddr – This structure holds socket address information.

struct sockaddr {
   unsigned short  sa_family;    // address family, AF_XXX
   char            sa_data[14];  // 14 bytes of protocol address;
};

sa_family can take a variety of values, but here it will be AF_INET (IPv4) or AF_INET6 (IPv6); you already know the difference by now ;p. sa_data contains the destination address and port number for the socket.

struct sockaddr_in – A structure specific to IPv4 that makes it easy to reference the elements of the socket address. Please remember that sin_zero must be set to all zeros with the memset() function.

struct sockaddr_in {
   short int          sin_family;    // Address family, AF_INET
   unsigned short int sin_port;      // Port number, network byte order
   struct in_addr     sin_addr;      // Internet address
   unsigned char      sin_zero[8];   // filled with zero to keep same size as sockaddr
};

struct in_addr – A structure that holds an internet address (IPv4).

struct in_addr {
   uint32_t s_addr;    // 32 bit int
};

struct sockaddr_in6 – A structure specific to IPv6, similar to the IPv4 version.

struct sockaddr_in6 {
   u_int16_t        sin6_family;    // Address family, AF_INET6
   u_int16_t        sin6_port;      // Port number, network byte order
   u_int32_t        sin6_flowinfo;  // IPv6 flow information
   struct in6_addr  sin6_addr;      // IPv6 address
   u_int32_t        sin6_scope_id;  // Scope ID
};

struct in6_addr – A structure that holds an internet address (IPv6).

struct in6_addr {
   unsigned char s6_addr[16];   // IPv6 address
};

struct sockaddr_storage – A structure large enough to hold either an IPv4 or an IPv6 socket address.

struct sockaddr_storage {
   sa_family_t  ss_family;	// address family

   // all this is padding, implementation specific
   char         __ss_pad1[_SS_PAD1SIZE];
   int64_t      __ss_align;
   char         __ss_pad2[_SS_PAD2SIZE];
};

The important field is ss_family; we can check whether it is AF_INET or AF_INET6, and then cast the structure to struct sockaddr_in or struct sockaddr_in6 as appropriate.

IP Address Representation

There are a number of functions for manipulating IP addresses. If we have the IP addresses “10.12.110.57” and “2001:db8:63b3:1::3490”, we use the inet_pton() function to convert them to their respective binary address representations. To reverse it (convert the binary representation back to a string), use the inet_ntop() function.

#include <arpa/inet.h>
#include <stdio.h>

struct sockaddr_in sa;	// IPv4
struct sockaddr_in6 sa6;	// IPv6
char ip4[INET_ADDRSTRLEN];
char ip6[INET6_ADDRSTRLEN];

inet_pton(AF_INET,"10.12.110.57",&(sa.sin_addr));	// IPv4
inet_pton(AF_INET6,"2001:db8:63b3:1::3490", &(sa6.sin6_addr));	// IPv6

inet_ntop(AF_INET, &(sa.sin_addr), ip4, INET_ADDRSTRLEN);
printf("IPv4 Address is: %s\n", ip4);
inet_ntop(AF_INET6, &(sa6.sin6_addr), ip6, INET6_ADDRSTRLEN);
printf("IPv6 Address is: %s\n", ip6);

Socket Descriptor

As we discussed a few articles ago, a socket in Unix is a file associated with an integer. To hold a socket we only need a variable, which can be as simple as an int.

Socket Creation

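To create a socket we call the socket() function. It takes 3 arguments: an address family (AF_INET / AF_INET6), a socket type (SOCK_STREAM / SOCK_DGRAM), and a protocol (0 lets the system choose the default protocol for the given type). It returns the socket descriptor, or -1 on error.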

#include <sys/types.h>
#include <sys/socket.h>

int socket_fd = socket(AF_INET, SOCK_STREAM, 0);

Socket Bind to

The socket was created in the previous step and stored in socket_fd, but it is not yet useful for communication because it has not been bound to any port. Binding attaches our socket to a port and creates a “tunnel” from our application to that port for its exclusive use. Once the socket is bound to a port, other programs normally cannot use the same port until our socket is released from it. The required information is stored in the struct sockaddr_* variant that matches the type of address we want to use (IPv4 or IPv6).

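#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

struct sockaddr_in sin;

memset(&sin, 0, sizeof(sin));   // zero the whole struct, including sin_zero
sin.sin_family = AF_INET;
sin.sin_port = htons(13510);
inet_pton(AF_INET, "127.0.0.1", &(sin.sin_addr));

bind( socket_fd, (struct sockaddr*) &sin, sizeof(sin) );   // returns 0 on success, -1 on error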

Socket Listen to

A bound socket can listen for incoming connections. The listen() function marks the socket as listening; its second argument (the backlog) sets the maximum number of pending connections that can wait in the queue before being accepted.

#include <sys/types.h>
#include <sys/socket.h>

listen( socket_fd, 20 );

Socket Accept request

Once a listening socket has an incoming connection, it is up to the programmer to accept it. To accept an incoming connection we call accept(), which returns a new file descriptor for that connection. Data can then be read from the new file descriptor (remember that everything in Unix is a file?). The peer is the remote computer that initiated the communication.

#include <sys/types.h>
#include <sys/socket.h>

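struct sockaddr_storage peer_addr;        // large enough for IPv4 or IPv6
socklen_t peer_len = sizeof(peer_addr);   // in: buffer size, out: actual address size

int accfd = accept( socket_fd, (struct sockaddr*) &peer_addr, &peer_len );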

Socket Receive message

After accepting the connection, we have a new file descriptor from which messages can be read. The next step is to fetch the data.

#include <sys/types.h>
#include <sys/socket.h>

char message[1024];
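ssize_t nbytes;

// returns the number of bytes received, 0 if the peer closed the connection, -1 on error
nbytes = recv(accfd, message, sizeof(message), 0);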

Socket Connect to

A socket can also be used to connect to another host and send messages. A socket created by socket() does not need to be bound to a port first; it only needs the address it wants to connect to.

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

struct sockaddr_in sin;

memset( &sin, 0, sizeof(sin) );   // zero the whole struct, including sin_zero
sin.sin_family = AF_INET;
sin.sin_port = htons(13510);
inet_pton(AF_INET, "127.0.0.1", &(sin.sin_addr));

int result = connect( socket_fd, (struct sockaddr*) &sin, sizeof(sin) );   // 0 on success, -1 on error

Socket Send message

Once the connection is established, we can send messages over our socket.

#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>

char message[1024];
strcpy(message,"To server, hello");
send(socket_fd, message, strlen(message), 0);   // send only the bytes of the string

In the last article, we discussed what a socket is. This article gives a quick overview of networking before we start learning socket programming.

Internet Protocol (IP) Address and Subnetwork

An Internet Protocol (IP) address is simply an address. Like an address in the real world, an IP address determines how your device can be reached by other devices. IP addresses underpin the Internet’s message delivery system, called routing, which is responsible for carrying data packets from one end to the other.

There are 2 types of IP address in public use: IPv4 (IP version 4) and IPv6 (IP version 6).

An IPv4 address is 32 bits long and is written as a series of four decimal numbers separated by dots. For example, 192.168.1.1 is an IPv4 address. IPv4 provides 2^32 addresses, or 4,294,967,296 (approximately 4 billion). These addresses are running out, so an alternative named IPv6 was developed.

An IPv6 address is 128 bits long and is written as a series of hexadecimal numbers separated by colons, for example 2001:0db8:c9d2:0012:0000:0000:0000:0051. IPv6 provides 2^128 addresses for the entire planet, vastly more than the older IPv4.

The Internet Assigned Numbers Authority (IANA, http://www.iana.org/) manages global IP address allocation and delegates it to five Regional Internet Registries (RIRs), which allocate IP blocks to Local Internet Registries (LIRs) and other organizations. The RIRs are thus IANA’s delegates for managing IP addresses around the world. The five RIRs are:

  1. African Network Information Centre (AFRINIC) for Africa.
  2. American Registry for Internet Numbers (ARIN) for the United States, Canada, and parts of the Caribbean.
  3. Asia-Pacific Network Information Centre (APNIC) for Asia, Australia, New Zealand, and neighboring countries.
  4. Latin America and Caribbean Network Information Centre (LACNIC) for Latin America and parts of the Caribbean.
  5. Réseaux IP Européens Network Coordination Centre (RIPE NCC) for Europe, the Middle East, and Central Asia.

LIRs, in turn, are organizations within each country. They assign IP addresses to customers in that country, under the supervision of their RIR.

A subnetwork, or subnet, is a logically visible subdivision of an IP network. All computers that belong to a subnet are addressed with a common, identical, most-significant bit group in their IP addresses. This results in the logical division of an IP address into two fields: a network or routing prefix, and the rest field or host identifier. The rest field identifies a specific host or network interface.

The routing prefix is expressed in CIDR notation, written as the first address of a network, followed by a slash character (/), and ending with the bit length of the prefix. For example, 202.168.1.0/24 is an IPv4 network with a 24-bit routing prefix. An IPv6 example is 2001:db8::/32, a large network with 2^96 addresses and a 32-bit routing prefix. In IPv4 the routing prefix can also be specified as a subnet mask, written in dotted-quad decimal notation like an address; for example, 255.255.255.0 is the network mask for a 24-bit routing prefix.

Port Numbers

In computer networking, a port is a logical gateway for exchanging information with the outside world. Like a port in the real world, it is the point through which incoming and outgoing packets pass. It is therefore a vital part of communication.

A port number is a 16-bit number, managed by the kernel, that acts like a local address for the connection.

We can think of an IP address as the address of a building (an apartment block), while the port number is the room number of each occupant (a program). A message passed from one program to another can stay in the same building (same machine) or go to a different building (different machine) via the postal service (the Internet).

Port numbers also differentiate services on a server, such as a mail server and a web server.

Different services on the Internet have different well-known port numbers. There is a big list maintained by IANA, or if you are on a Unix machine you can see the list in the /etc/services file. HTTP (web) is port 80, telnet is 23, SMTP is 25, and so on.

Byte Order

There are two ways a computer might store a multi-byte number: Big-Endian or Little-Endian. A Big-Endian system stores the number with the big end (the most significant byte) first, while Little-Endian is the reverse. Intel and Intel-compatible processors store data in Little-Endian, so the hexadecimal number 0xb34f is stored in memory as the byte 4f followed by b3. In Big-Endian, 0xb34f is stored as the two sequential bytes b3 followed by 4f.

The machine-dependent number representation (Little-Endian or Big-Endian, depending on the processor) is called Host Byte Order. Data sent across the network is standardized on Big-Endian, and this byte order is called Network Byte Order.

What is Socket?

November 24, 2015 | Article | No Comments

In the computer science world, especially in networking, sockets play a great role. A socket is an interface that lets our programs communicate with each other across this vast world. Put simply, it is a way to “speak” to other programs using standard Unix file descriptors. I said standard Unix file descriptors, and yes, using a socket is basically the same as file handling (create, open, write, read) but with some more capabilities.

“Everything in Unix is a file!”

You may have heard that statement; at the very least you read it a moment ago in this article. And yes, everything in Unix is a file. Unix does every sort of I/O by reading from or writing to a file descriptor. A file descriptor is simply an integer associated with an open file. A file in Unix can be anything: a real file on disk, a network connection, a FIFO, a pipe, a terminal, and so on. So when you want to talk to another program over the Internet, you do it through a file descriptor.

There are many kinds of sockets: DARPA Internet addresses (Internet sockets), path names on a local node (Unix sockets), CCITT X.25 addresses (X.25 sockets), and probably many others depending on the Unix flavor. The most common is the Internet socket, as it is what the Internet uses today.

There are two common types of Internet sockets: SOCK_STREAM (stream sockets) and SOCK_DGRAM (datagram sockets). The types refer to whether they maintain a connection: SOCK_STREAM is a connection-oriented socket, while SOCK_DGRAM is a connectionless socket. They differ in how they transmit data.

A connection-oriented socket, as its name suggests, uses a connection to be reliable. Every packet transmitted through this socket has to be acknowledged. The receiver must send a message saying “Hey, I have received your message” so that the sender knows its message has arrived at the receiving end. With a connectionless socket, the sender just sends the data without requiring an acknowledgment from the receiver.

A connection-oriented socket is needed when the transferred data must arrive in exactly the same order. It would be bad if you sent a file and the 128th byte was swapped with the 129th because of a delay. A connection-oriented socket ensures that data arrives complete and in the exact order it was sent.

A connectionless socket is useful when you need to transfer real-time data regardless of order. Streaming is one example. One or two packets arriving late (out of order, or not arriving at all) is not a problem as long as you keep getting a real-time update of the video (a football match, maybe).

Unix and its successors have sockets. Windows too, in its own way, has sockets. Every modern OS provides sockets as a standard way to communicate.

Next I will show you how to do socket programming in a general way.
