Moving Data through slow media
My intention is to provide simple, free access to useable solutions to common but difficult programming problems.
If you know any better solutions then please e-mail me so that everyone can share your wisdom!
Be aware that I use a non-standard "Coding Style"... and it's infectious...
The later code sections on this page get increasingly complex as many classes get used together to solve new problems.
All the classes used have explanations on this site and should be simple to understand on their own.
Windows Sockets
Windows Sockets made easy
Don't specify a "Windows Sockets Application" when creating your Project with AppWizard for any of these classes - there's no need :-)
Windows Sockets allow you to get data between networked computers whether the network is a LAN, WAN or the Internet.
Even the simplest examples I've seen elsewhere are too difficult for casual use,
and cause people problems when trying to use Sockets in separate Threads.
Windows Sockets can get as difficult as you'd like to make them... it took about two months to evolve these classes.
I've tried to see through the fog and create the simplest useful classes that I can.
Ultimately they send and receive data through CArchives. This is discussed later.
CSocketClient lets you create a Client Application.
That means that your Application talks and listens to exactly one other Computer which the user (or code) has to specify by IP address (or Computer Name) and TCP Port.
This may be a one to one, Peer to Peer arrangement, or your application may be a Client which is talking to a Server which is handling thousands of other Clients at the same time.
Each CSocketClient instance makes one Socket connection.
You'd normally use Send("Msg"); and Msg=GetReply(); to send and receive Messages.
CSocketServer and CConnection allow your Application to talk and listen to many other computers at the same time.
This is Server or full Peer to Peer behaviour.
CSocketServer listens for connections on a particular TCP Port.
It does this in a separate Thread, so your Application can carry on with other work.
When a Client requests a Connection, an instance of CConnection is created and added to a list of classes held by CSocketServer.
CConnection is derived from CSocketBase and behaves in a similar fashion to CSocketClient except that each instance of CConnection is running in a separate Thread.
This means that if one instance of CConnection hangs for some reason, the Server should continue to function.
- At this point I would like to mention I/O Completion Ports, scalability and resources.
- This code does not use I/O Completion Ports.
- If you are developing an application which will ultimately have thousands of connections then you need to use I/O Completion Ports and/or many servers.
- I/O Completion Ports use a pool of 50 or more Worker Threads...
- If you have a peak of fewer than 50 simultaneous Connections then this code (using one Thread per Connection) will use fewer System Resources and be far faster to code!
By default the Port is 14, which is not a Standard Port (there shouldn't be anything else listening on this Port).
Ports over 1024 are usually free for new Applications.
Standard Ports are listed in your Windows\system32\drivers\etc\SERVICES file.
You may get a TRACE:"Warning - Attempt made to seek on a CSocketFile" from these routines: it's just the CArchives closing down.
Be aware that if the socket uses an Internet Connection it is subject to the same restrictions as e-mails and Web Pages:
Binary data may be altered by Routers, so it should be sent using UU, Base64 or Quoted Printable encoding.
If you are sending text with a few strange characters, use Quoted Printable.
If you are sending Binary Data use Base64 or an equivalent of your own design (UU encoding is less efficient).
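As a rough illustration of the Base64 option, here is a minimal encoder sketch in standard C++ (no MFC). The function name Base64Encode is my own and is not part of the classes on this page; it assumes the standard RFC 4648 alphabet with '=' padding.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Encode arbitrary bytes as Base64 text (RFC 4648 alphabet, '=' padding).
// Hypothetical helper, not part of CSocketClient/CSocketServer.
std::string Base64Encode(const std::vector<std::uint8_t>& data) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((data.size() + 2) / 3 * 4);
    std::size_t i = 0;
    // Full groups: 3 input bytes become 4 output characters.
    for (; i + 3 <= data.size(); i += 3) {
        std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
    }
    // Tail: 1 or 2 leftover bytes, padded with '=' to a 4-character group.
    std::size_t rest = data.size() - i;
    if (rest == 1) {
        std::uint32_t n = data[i] << 16;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += "==";
    } else if (rest == 2) {
        std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += '=';
    }
    assert(out.size() % 4 == 0);  // Base64 output is always a multiple of 4 characters
    return out;
}
```

The encoded text contains only letters, digits, '+', '/' and '=', so it can be sent as an ordinary line of text. The cost is that every 3 Bytes of data become 4 characters.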
I've read that CArchive remembers every object it saves (so that it only saves a reference if it gets the same object twice)...
I'm not sure that this is true for CArchives linked to CSocketFiles...
But just in case, stop this behaviour (which is obviously wrong for a Socket transmission) by using the following instead of CArchive if you're sending Objects through CArchive:
class CSocketArchive : public CArchive {
public:
CSocketArchive(CFile* pFile, UINT nMode, int nBufSize=4096) : CArchive(pFile, nMode, nBufSize) {};
void ClearIndex() {
if(IsStoring()) {
if(m_nMapCount>1) {
m_pStoreMap->RemoveAll();
m_pStoreMap->SetAt(NULL,(void*)(DWORD)0);
m_nMapCount=1;
}
}else while(m_nMapCount-->1) m_pLoadArray->RemoveAt(m_nMapCount);
}
};
CSocketClient
This is the standard Client class.
It allows you to connect to a Port on a specific Server;
Call Send(...) to start a conversation;
When you expect a reply use GetReply():
CSocketClient Chat("127.0.0.1", 23);
Chat.Send("Hi");
if(Chat.GetReply()=="bye") Chat.Disconnect();
else Chat.Send("bye");
If the Server starts a conversation,
the overridable function OnReceive() is called
from where you may use GetReply() and Send(...) again.
Here's a Chat Server Client example:
Create a Dialog Application with a ListBox to print messages in, an EditBox to Type Messages in, and a [Send] Button.
The GetReply and Send overrides change the default behaviour from sending CStrings to sending ordinary strings, making the Client interchangeable with a Telnet session.
#include "SocketInterface.h"
class CClientDlg : public CDialog {
...
public:
class CChatClient : public CSocketClient {
CClientDlg* Dlg;
void OnReceive() {Dlg->Say(GetReply());}
void OnClose () {Dlg->Disconnect();}
CString GetReply() {
CString S;
ArI->ReadString(S); // It'll wait here for as long as the connection is OK. If the connection is broken it returns an empty string.
return S;
}
public:
void Send(CString Msg) {ArO->WriteString(Msg+"\r\n"); ArO->Flush();}
CString Connect(CString RemoteIP, BYTE Port, CClientDlg* _Dlg) {Dlg=_Dlg; return CSocketClient::Connect(RemoteIP,Port);}
} Chat;
void Say(CString Msg) {if(::IsWindow(m_hWnd)) ((CListBox*)GetDlgItem(IDC_LIST1))->AddString(Msg);}
void Disconnect() {
Say("Disconnected");
GetDlgItem(IDC_Send)->EnableWindow(FALSE);
}
...
};
BOOL CClientDlg::OnInitDialog() {
CDialog::OnInitDialog();
CString S(Chat.Connect("127.0.0.1",23, this)); //Note that you can't connect to yourself: You'll need to be running a Server on this PC to get a connection on 127.0.0.1
if(!S.IsEmpty()) Say(S); // Report any Error Messages
return TRUE; // return TRUE unless you set the focus to a control
}
//When a Send Button is clicked, the text from an Edit Box (IDC_EDIT1) is sent.
void CClientDlg::OnSend() {
CString S;
GetDlgItemText(IDC_EDIT1, S);
Chat.Send(S);
}
CSocketServer
Start by deriving a class from CConnection and overriding the virtual functions to encapsulate the behaviour you want for one Connection.
Since Telnet is a standard application on Windows, we can make a Telnet Chat Server as an example.
The Server will accept connections from Telnet or from the client described above using CSocketClient.
Create a Dialog Application with a ListBox to print messages in, an EditBox to Type Messages in, and a [Broadcast] Button.
The Server needs a pointer to the Dialog Box to give textual output to be displayed in a ListBox.
Derive a class from CSocketServer called CChatServer which simply holds a pointer to the Dialog and provides a Say(...) function that will add text to a List Box:
#include "SocketInterface.h"
class CChat; // <- forward references
class CServerDlg;
class CChatServer : public CSocketServer {
CServerDlg* Dlg;
public:
CChatServer(BYTE Port, CServerDlg* Dlg) : CSocketServer(Port), Dlg(Dlg) {}
void Say(CString Msg);
protected:
virtual CConnection* New();
};
void CChatServer::Say(CString Msg) {Dlg->Say(Msg);}
CConnection* CChatServer::New() {return new CChat(this);}
class CServerDlg : public CDialog {
CChatServer* Chat;
public:
CServerDlg(CWnd* pParent=NULL) : CDialog(CServerDlg::IDD, pParent), Chat(0) {} // standard constructor
virtual ~CServerDlg();
void Say(CString Msg) {if(::IsWindow(m_hWnd)) ((CListBox*)GetDlgItem(IDC_LIST1))->AddString(Msg);}
...
};
After the Dialog class is the Handler class, derived from CConnection.
This provides all the functionality for one Chat Server Connection.
GetReply and Sender alter the default behaviour from sending CStrings to ordinary strings.
class CChat : public CConnection {
void Say (CString Msg) {((CChatServer*)GetServer())->Say(ID+Msg);}
void Broadcast(CString Msg) {Say(Msg); ((CChatServer*)GetServer())->Broadcast(ID+Msg);}
CString ID;
CString GetReply() {
CString S;
try {
ArI->ReadString(S); // It'll wait here for as long as the connection is OK. If the connection is broken it returns an empty string.
}catch(...) {return '¿'+sConnectionBroken;}
return S;
}
public:
CChat(CSocketServer* Me) : CConnection(Me) {}
protected:
void Sender(CString Msg) {ArO->WriteString(Msg+"\r\n"); ArO->Flush();}
void OnDisconnect() {Broadcast("Disconnected");}
void OnConnect() {
ID.Format("%s(%i): ", RemoteIP, m_hSocket);
Broadcast("Connected");
Send("You will appear as "+ID);
}
void OnReceive() {
CString S(GetReply());
if(*S=='¿') {Say(S.Mid(1)); return;}
if(!S.CompareNoCase("Bye")) Disconnect();
Broadcast(S);
}
};
Create the server in OnInitDialog, destroy it in the destructor and create a Button labeled [Broadcast] which sends some text in an Edit Box to all connections:
BOOL CServerDlg::OnInitDialog() {
CDialog::OnInitDialog();
Chat=new CChatServer(23,this); // C:\Windows\System32\drivers\etc\SERVICES file lists Telnet as TCP port 23
return TRUE; // return TRUE unless you set the focus to a control
}
CServerDlg::~CServerDlg() {delete Chat;}
void CServerDlg::OnBroadcast() {
CString S;
GetDlgItemText(IDC_EDIT1, S);
Say("Broadcasted: "+S);
Chat->Broadcast("Server: "+S);
}
The CConnection will make the CSocketServer hold a list of CChats.
If the Server didn't need to give any visual feedback (if running as a Service, for example)
you could put the following in your Application class InitInstance():
#include "Telnet.h"
BOOL CMyApp::InitInstance() {
CSocketServer(23); // C:\Windows\System32\drivers\etc\SERVICES file lists Telnet as TCP port 23
...
}
So you will now have an application which runs a Server listening to TCP port 23 which will create a new instance of CChat for each new Connection.
When this application is running, you can open Telnet: [Start Menu][Programs][Accessories][Telnet][File Menu][Remote System...][Host Name]127.0.0.1[Connect]
Then type Hi and press return, then bye and press return to watch the session work.
Incidentally, 127.0.0.1 is the IP Address used to refer to the local computer regardless of the IP Addresses of any network cards it may have.
CArchive and CString:
Usually you should be able to just use CStrings to hold data using the << and >> operators which send and use the data length before the string data.
You can't use CArchive::ReadString and CArchive::WriteString to transfer arbitrary data because they scan the data looking for '\n' and decide that's the end of the string; the data length is not sent.
You can run into confusion when holding data in CStrings if you use any functions (like fputs) which look for a Null Terminator instead of using the data Length.
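To make the difference concrete, here is a small sketch in standard C++ that uses a std::string as a stand-in for the wire. The function names and the fixed 4-Byte length prefix are my own assumptions for illustration; this is not the actual format MFC uses when it serialises a CString.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append a message as [4-byte little-endian length][bytes].
// Illustrative format only, not MFC's CString serialisation.
void WriteLengthPrefixed(std::string& wire, const std::string& msg) {
    std::uint32_t n = static_cast<std::uint32_t>(msg.size());
    for (int shift = 0; shift < 32; shift += 8)
        wire += static_cast<char>((n >> shift) & 0xFF);
    wire += msg;
}

// Read one length-prefixed message starting at pos; advances pos past it.
std::string ReadLengthPrefixed(const std::string& wire, std::size_t& pos) {
    assert(pos + 4 <= wire.size());
    std::uint32_t n = 0;
    for (int shift = 0; shift < 32; shift += 8)
        n |= static_cast<std::uint32_t>(static_cast<unsigned char>(wire[pos++])) << shift;
    assert(pos + n <= wire.size());
    std::string msg = wire.substr(pos, n);
    pos += n;
    return msg;
}

// Read up to (not including) the next '\n', the way a line reader would.
std::string ReadLine(const std::string& wire, std::size_t& pos) {
    std::size_t end = wire.find('\n', pos);
    if (end == std::string::npos) end = wire.size();
    std::string line = wire.substr(pos, end - pos);
    pos = (end < wire.size()) ? end + 1 : end;
    return line;
}
```

With a payload such as the five Bytes 'A', 0, 'B', '\n', 'C', the length-prefixed pair returns all five Bytes in one piece, while the line reader stops at the embedded '\n' and hands back two separate lines.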
It is possible to use the CArchive::Write or CArchive::WriteString but you must be careful in the Server because data is sent in a queue and direct accesses like this won't be using the queue...
Send appends the message to the Message Queue.
The Message Queue doesn't get processed until you return from OnReceive().
In this example:
void OnReceive() {
Send("Hello");
ArO << MyObject;
}
Hello will be sent _after_ MyObject!
Use Sender("Hello") or ArO << CString("Hello") << MyObject; to send Messages with objects.
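The ordering problem can be modelled without any sockets at all. The sketch below is a toy stand-in written in standard C++: MockConnection, WriteDirect and FlushQueue are hypothetical names that only imitate the behaviour described above (Send queues, Sender and direct archive writes go out immediately, the queue is processed after OnReceive() returns). It is not the real CConnection code.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy model of a server connection (hypothetical, for illustration only).
class MockConnection {
    std::deque<std::string> queue_;   // messages waiting until OnReceive() returns
public:
    std::vector<std::string> wire;    // what the remote end sees, in order

    void Send(const std::string& msg)        { queue_.push_back(msg); } // queued
    void Sender(const std::string& msg)      { wire.push_back(msg); }   // immediate
    void WriteDirect(const std::string& obj) { wire.push_back(obj); }   // like ArO << obj

    // Stands in for the queue processing that happens after OnReceive() returns.
    void FlushQueue() {
        while (!queue_.empty()) {
            wire.push_back(queue_.front());
            queue_.pop_front();
        }
    }
};

// Mirrors the problem case: Send("Hello") followed by a direct object write.
std::vector<std::string> QueuedThenDirect() {
    MockConnection c;
    assert(c.wire.empty());     // nothing has been written yet
    c.Send("Hello");            // goes into the queue
    c.WriteDirect("MyObject");  // goes out straight away
    c.FlushQueue();             // queue is only processed now
    return c.wire;
}

// Mirrors the fix: use Sender so both writes bypass the queue.
std::vector<std::string> DirectBoth() {
    MockConnection c;
    c.Sender("Hello");
    c.WriteDirect("MyObject");
    c.FlushQueue();
    return c.wire;
}
```

In this model QueuedThenDirect() puts MyObject on the wire before Hello, and DirectBoth() keeps them in the order they were written.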
You can download a complete Chat Server project here that includes a Client although it works with Telnet as a client too:
(Known Bug: Client crashes if started with no Server available)
Delta Compression
Minimal Data Transfer
CBlockDelta Remote File Get with minimal data transfer.
This class is for Client/Server File transfer across slow mediums.
It is intended for use in Backup/Archiving/Versioning Applications.
Any computer wanting a file from another computer simply calls CBlockDelta::Get(...) with appropriate arguments.
The minimal data transfer is a result of Delta Compression:
If the computer asking for the file already has a similar file (an older version),
the data transmission simply describes the changes needed to convert the old file into the new file.
In addition, the Data Blocks transmitted are LZW compressed if the system thinks it's worth doing.
The Get and Put routines use CArchives as communication links; you would normally use CSocketFile based CArchives.
The Delta-Compression Process
Normally when comparing two files you compare the first Byte, then the next, and so on,
but over a slow medium that would mean copying the whole of the Remote file over anyway, for the comparison, so you need a way of comparing blocks of Bytes at a time.
Comparing blocks can be done with checksums... MD5 would compare two blocks very accurately, but it's quite slow.
CBlockDelta uses 512 Byte blocks simply because I found that worked best and varying the block size makes the code ugly.
The spark that makes delta-compression fire is the "Rolling Checksum" which allows you to get the checksum of the first 512 Bytes (all fine and normal so far)
but then you can get the checksum of the 512 Bytes starting from Byte 1 simply by removing the first Byte (Byte[0]) and adding the next Byte (Byte[512]) with a little bit of maths (rather than re-visiting all the Bytes in the middle).
This ability to "roll" the checksum forward one Byte allows us to efficiently scan a file for a block that has a particular checksum, and find that block no matter what its displacement is.
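Here is a minimal sketch of one such checksum in standard C++. The page does not say which formula CBlockDelta uses, so this assumes an rsync-style pair of 16-bit sums; the RollingSum name and its members are mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// rsync-style weak checksum over a fixed-size window (an assumed formula,
// not necessarily the one used inside CBlockDelta).
struct RollingSum {
    std::uint32_t a = 0;     // sum of the Bytes in the window
    std::uint32_t b = 0;     // sum of the running values of a (position-weighted)
    std::size_t window = 0;  // window length in Bytes

    // Compute the checksum of data[0 .. len) from scratch.
    void Init(const std::uint8_t* data, std::size_t len) {
        assert(len > 0);
        a = b = 0;
        window = len;
        for (std::size_t i = 0; i < len; ++i) {
            a += data[i];
            b += a;
        }
    }

    // Slide the window one Byte forward: drop `out`, append `in`.
    // Only two Bytes are touched, however large the window is.
    void Roll(std::uint8_t out, std::uint8_t in) {
        a = a - out + in;
        b = b - static_cast<std::uint32_t>(window) * out + a;
    }

    // Pack both 16-bit halves into one 32-bit checksum value.
    std::uint32_t Value() const {
        return ((b & 0xFFFF) << 16) | (a & 0xFFFF);
    }
};
```

Calling Init on the first 512 Bytes and then Roll(Byte[0], Byte[512]) gives the same Value() as calling Init on the 512 Bytes starting at Byte 1, but without re-reading the 511 Bytes in the middle.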
Walkthrough:
One computer wants the latest version of a file that it already has an old version of.
So it creates an array of checksums, two entries for each 512 Byte block of the file that it has.
One array entry uses the Rolling Checksum algorithm on the block (because it's fast).
The other array entry contains the MD5 checksum for the block (because it's accurate).
It sends this array of Checksums to a computer which has the more recent version of the file.
This computer opens the more recent version and looks at the first 512 Bytes, forming the Rolling Checksum and sees if that checksum exists in the Checksum array.
If it doesn't then it rolls one Byte forward in its file, by removing the first Byte from the rolling checksum, and adding the 513th Byte to the Rolling checksum.
It looks again for the new value in the Checksum Array.
Assume that it finds it, this time.
Since the Rolling Checksum isn't very accurate (different blocks may give the same value), the MD5 checksum is calculated and checked with the appropriate entry in the array (which is now a Hash Table of Rolling Checksums, each with an array of matching MD5 Checksums, for speed).
If we can find an entry with the same MD5 value, we assume we have the same block (true, we may not, but the chances are microscopic - the chances of two blocks having the same MD5 is small, but having the same MD5 AND the same rolling checksum? Perhaps someone with a supercomputer would like to find blocks that give matching checksums for me :-).
So that means that we have one new byte at the beginning of the file and a block that we already know.
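The scan up to this point can be sketched in standard C++ as follows. Several things here are assumptions of mine rather than the CBlockDelta code: whole files are held in std::strings, the weak checksum is an rsync-style sum, a 64-bit FNV-1a hash stands in for MD5 (the standard library has no MD5), and only full 512 Byte blocks go into the table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

const std::size_t kBlock = 512;

// Weak checksum of d[pos, pos+len), computed from scratch.
static std::uint32_t Weak(const std::string& d, std::size_t pos, std::size_t len) {
    std::uint32_t a = 0, b = 0;
    for (std::size_t i = 0; i < len; ++i) {
        a += static_cast<unsigned char>(d[pos + i]);
        b += a;
    }
    return ((b & 0xFFFF) << 16) | (a & 0xFFFF);
}

// Stand-in for MD5: 64-bit FNV-1a (assumption, used only to keep this self-contained).
static std::uint64_t Strong(const std::string& d, std::size_t pos, std::size_t len) {
    std::uint64_t h = 0xcbf29ce484222325ULL;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(d[pos + i]);
        h *= 0x100000001b3ULL;
    }
    return h;
}

// Weak checksum -> list of (strong checksum, block number in the old file).
using SumTable =
    std::unordered_map<std::uint32_t, std::vector<std::pair<std::uint64_t, std::size_t>>>;

SumTable BuildTable(const std::string& oldData) {
    SumTable t;
    for (std::size_t blk = 0; (blk + 1) * kBlock <= oldData.size(); ++blk)
        t[Weak(oldData, blk * kBlock, kBlock)].push_back(
            {Strong(oldData, blk * kBlock, kBlock), blk});
    return t;
}

// Returns {offset in newData, old block number} for the first matching block,
// or {npos, npos} if no block of the old file appears in newData.
std::pair<std::size_t, std::size_t> FindFirstMatch(const SumTable& t,
                                                   const std::string& newData) {
    const std::size_t npos = std::string::npos;
    if (newData.size() < kBlock) return {npos, npos};
    std::uint32_t a = 0, b = 0;
    for (std::size_t i = 0; i < kBlock; ++i) {
        a += static_cast<unsigned char>(newData[i]);
        b += a;
    }
    for (std::size_t off = 0;; ++off) {
        assert(off + kBlock <= newData.size());
        auto it = t.find(((b & 0xFFFF) << 16) | (a & 0xFFFF));
        if (it != t.end()) {
            // Weak hit: confirm with the slower, stronger checksum.
            std::uint64_t strong = Strong(newData, off, kBlock);
            for (const auto& entry : it->second)
                if (entry.first == strong) return {off, entry.second};
        }
        if (off + kBlock >= newData.size()) break;
        // Roll one Byte forward: drop newData[off], add newData[off + kBlock].
        unsigned char out = static_cast<unsigned char>(newData[off]);
        unsigned char in = static_cast<unsigned char>(newData[off + kBlock]);
        a = a - out + in;
        b = b - static_cast<std::uint32_t>(kBlock) * out + a;
    }
    return {npos, npos};
}
```

If newData is the old file with one extra Byte at the front, FindFirstMatch reports offset 1 and block 0, which is the situation described in the walkthrough.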
CBlockDelta forms a CString (called Comparison) which describes the differences. It is just text in which each line describes either a range of Bytes, or a block number or range of blocks.
So if the first Byte was all that was added, and we had 1000 blocks of data afterwards, the Comparison string would simply say:
1,1
0-999
- All numbers in the string are in Hexadecimal.
- The string contains lines of text separated with '\n' characters.
- A single number on a line indicates a Block from the older File.
- A Block is 512 Bytes except for the last Block which may be shorter (FileSums::GetEndBlockLength()).
- A pair of numbers separated with a '-' indicate a range of adjacent Blocks from the older File ([First Block]-[Last Block] inclusive).
- A pair of numbers separated with a ',' indicate a range of adjacent Bytes from the newer File ([File Position],[Bytes to read]).
- If the Files are equal you get a single line string: 0-FileSums::GetBlockCount() indicating that all the Blocks of the older File make the newer file.
- If the Files are totally different you get a single line string: 0,FileSums::FileLength indicating that all the Bytes of the newer File make the newer file.
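The rules above are simple enough to parse in a few lines. The following standard C++ sketch reads a string in that format into a list of instructions; the Instruction struct and ParseComparison function are hypothetical names of mine, and malformed lines are not handled gracefully (std::stoul throws, the assert aborts).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One line of a Comparison string (hypothetical representation).
struct Instruction {
    enum Kind { OldBlocks, NewBytes } kind;
    unsigned long first;   // first Block, or File Position
    unsigned long second;  // last Block (inclusive), or Bytes to read
};

// Parse a '\n' separated Comparison string whose numbers are hexadecimal.
std::vector<Instruction> ParseComparison(const std::string& text) {
    std::vector<Instruction> out;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        Instruction ins{Instruction::OldBlocks, 0, 0};
        std::size_t sep = line.find_first_of("-,");
        if (sep == std::string::npos) {
            // A single number: one Block from the older File.
            ins.first = ins.second = std::stoul(line, nullptr, 16);
        } else {
            assert(sep > 0 && sep + 1 < line.size());  // need a number on both sides
            ins.first = std::stoul(line.substr(0, sep), nullptr, 16);
            ins.second = std::stoul(line.substr(sep + 1), nullptr, 16);
            // '-' is a Block range from the older File, ',' is Bytes from the newer File.
            ins.kind = (line[sep] == '-') ? Instruction::OldBlocks : Instruction::NewBytes;
        }
        out.push_back(ins);
    }
    return out;
}
```

For example, the made-up string "0,2A\n0-1F\n25" parses as: read 0x2A Bytes from position 0 of the newer File, then use Blocks 0 to 0x1F of the older File, then Block 0x25 of the older File.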
MakeDelta:
The computer with the latest version of the file then uses the Comparison String to return a binary stream of data that contains blocks of compressed data that the other computer doesn't have, and block numbers that it does have.
UseDelta:
The computer with the old version of the file creates a new file using the blocks from either the data stream, or the old file according to the instructions in the data stream.
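A much simplified sketch of that rebuilding step in standard C++ is shown here. The real data stream is binary and its blocks may be compressed; this version assumes the stream has already been decoded into a list of operations, and DeltaOp and ApplyDelta are hypothetical names rather than CBlockDelta members.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const std::size_t kBlock = 512;

// One decoded instruction (hypothetical representation).
struct DeltaOp {
    bool fromOld;            // true: copy Blocks from the old file; false: use literal
    std::size_t firstBlock;  // first old Block (used when fromOld is true)
    std::size_t lastBlock;   // last old Block, inclusive
    std::string literal;     // new Bytes, already decompressed (used when fromOld is false)
};

// Build the new file from the old file plus the decoded instructions.
std::string ApplyDelta(const std::string& oldData, const std::vector<DeltaOp>& ops) {
    std::string out;
    for (const DeltaOp& op : ops) {
        if (!op.fromOld) {
            out += op.literal;
            continue;
        }
        assert(op.firstBlock <= op.lastBlock);
        std::size_t begin = op.firstBlock * kBlock;
        // The last Block of the old file may be shorter than 512 Bytes.
        std::size_t end = std::min((op.lastBlock + 1) * kBlock, oldData.size());
        assert(begin <= end);
        out.append(oldData, begin, end - begin);
    }
    return out;
}
```

With a single literal Byte followed by "all Blocks of the old file", the output is the old file with that Byte inserted at the front, which matches the walkthrough case.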
Example Usage:
Derive a class from CBlockDelta that will create CSocketFile based CArchives.
The following example is of a Dialog Box with a CBlockDelta class within it.
Create a dialog application with a List Box (IDC_LIST1) and make the dialog class include the following code:
#include <afxsock.h> // MFC socket extensions
#include "BlockDelta.h"
class CBlockDeltaTesterDlg : public CDialog {
public:
CBlockDeltaTesterDlg(CWnd* pParent = NULL); // standard constructor
void Say(CString Msg) {((CListBox*)GetDlgItem(IDC_LIST1))->AddString(Msg);}
class CRemoteFileFetcher : public CBlockDelta {
CBlockDeltaTesterDlg* Dlg;
static DWORD FAR PASCAL ThreadProc(LPVOID lpData) {((CRemoteFileFetcher*)lpData)->Listen(); return 0;}
HANDLE hThread;
DWORD ThreadID;
public:
CRemoteFileFetcher(CBlockDeltaTesterDlg* Dlg) : Dlg(Dlg) {hThread=CreateThread((LPSECURITY_ATTRIBUTES)NULL, 0, (LPTHREAD_START_ROUTINE)ThreadProc, (LPVOID)this, 0, &ThreadID);}
void Say(CString Msg) {Dlg->Say(Msg);}
virtual void Getting (CString Remote) {Say("<Getting "+Remote);} // Called when starting to Get a file
virtual void Putting (CString Local ) {Say(">Putting "+Local);} // Called when starting to Put a file
virtual void SetBytesToRead (DWORD Bytes) {CString S; S.Format("<%u Bytes to read" ,Bytes); Say(S);} // Called as soon as BytesToRead is known
virtual void SetBytesToWrite(DWORD Bytes) {CString S; S.Format(">%u Bytes to write",Bytes); Say(S);} // Called as soon as BytesToWrite is known
virtual void SetInProgress (DWORD Bytes) {CString S; S.Format("<%u Bytes in" ,Bytes); Say(S);} // Called whenever Bytes are received
virtual void SetOutProgress (DWORD Bytes) {CString S; S.Format(">%u Bytes out" ,Bytes); Say(S);} // Called whenever Bytes are sent
virtual void Received (char Compression) {CString S; S.Format("<%u%% compressed" ,Compression); Say(S);} // Called when finished Putting (Compression is %Compressed)
virtual void Sent (char Compression) {CString S; S.Format(">%u%% compressed" ,Compression); Say(S);} // Called when finished Getting (Compression is %Compressed)
void Listen() {
AfxSocketInit();
CSocket ServerSocket;
ServerSocket.Create(14);
ServerSocket.Listen();
for(;;) {
CSocket Socket;
ServerSocket.Accept(Socket);
CSocketFile File(&Socket);
CArchive ArI(&File, CArchive::load);
CArchive ArO(&File, CArchive::store);
Put(ArI,ArO);
} }
// Returns an error message or "" if everything worked.
CString Fetch(CString Remote, CString Local, CString Output, CString ServerIP) {
AfxSocketInit();
CSocket Socket;
if(!Socket.Create()
|| !Socket.Connect(ServerIP, 14)) return "Socket creation failed";
CSocketFile SocketFile(&Socket);
CArchive ArI(&SocketFile, CArchive::load);
CArchive ArO(&SocketFile, CArchive::store);
return Get(Remote,Local,Output, ArI,ArO);
}
};
... The rest of the dialog class
};
Make the OnInitDialog() look like this:
BOOL CBlockDeltaTesterDlg::OnInitDialog() {
CDialog::OnInitDialog();
CRemoteFileFetcher RemoteFileFetcher(this);
CString S;
S=RemoteFileFetcher.Fetch("Remote.txt","Local.txt","Output.txt", "127.0.0.1");
if(!S.IsEmpty()) AfxMessageBox(S); // could just return S if this is inside a Function returning a CString.
return TRUE; // return TRUE unless you set the focus to a control
}
CRemoteFileFetcher creates a listening Thread as soon as it is instantiated.
So if you run it on two computers, either computer may use the Fetch(...) function and the other will respond.
For a quick test on a single computer you just use the IP address "127.0.0.1", otherwise use two computers.
In the project's folder (the usual startup directory in Debug mode) copy the normal ReadMe.txt file to a file called Local.txt.
Then copy Local.txt to a file called Remote.txt.
Edit Local.txt with Notepad: select all the text and copy it, then paste it six times in succession.
Now you have two similar files and this dialog will report what is happening when they are compared.
If you were doing this on two computers, Remote.txt would be on the Server and Local.txt on the Client.
The line in the OnInitDialog():
S=RemoteFileFetcher.Fetch("Remote.txt","Local.txt","Output.txt", "127.0.0.1");
tells the class to Fetch Remote.txt using Local.txt (if it can) to create Output.txt on the Local(Client) computer.
The block size is fixed throughout the class as 512 Bytes (found by experiment to be optimum).
My ReadMe.txt file was split into 7 blocks and one 175 Byte block which is sent compressed to 109 Bytes.
Since the Blocks are only sent as numbers the zipped block was the only File data sent through the Sockets.
Taking all socket data into account, the file was effectively sent through the Sockets 89% compressed and 96% compressed if the Remote.txt is the one with six pastes.
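The block arithmetic behind those figures is easy to check. Seven full blocks plus one 175 Byte block means the ReadMe.txt in question was 7 x 512 + 175 = 3759 Bytes long. The helper names below are my own and only illustrate the calculation; they are not the FileSums functions.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kBlockSize = 512;

// Number of Blocks needed to cover a file, counting a short last Block.
std::size_t BlockCount(std::size_t fileLength) {
    return (fileLength + kBlockSize - 1) / kBlockSize;
}

// Length of the last Block: 512 if the file divides evenly, otherwise the remainder.
std::size_t EndBlockLength(std::size_t fileLength) {
    if (fileLength == 0) return 0;
    std::size_t rest = fileLength % kBlockSize;
    std::size_t len = (rest == 0) ? kBlockSize : rest;
    assert(len >= 1 && len <= kBlockSize);
    return len;
}
```

For a 3759 Byte file BlockCount gives 8 and EndBlockLength gives 175.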
CBlockDelta requires the following additional files:
Remote File Access
Remote DOS
This class provides a simple interface to provide access to a Remote Computer's files in the same way that FTP does.
The difference is that this uses Delta Compression for all file transfers.
Imagine that you have already downloaded a huge file from the server but that you know the file on the Server has been changed and you want the updated version.
This class will compare your Local File to the Remote File and only download the differences in the files.
This class is for Peer to Peer or Client/Server File transfer across slow mediums.
It is intended for use in Backup/Archiving Applications.
The Client/Server simply creates an instance of CRDOS which sets up a thread listening on TCP Port 14.
Any computer wanting a file from another computer simply calls CRDOS::Get(...) with appropriate arguments.
The minimal data transfer is a result of Delta Compression:
If the computer asking for the file already has a similar file (an older version),
the data transmission simply describes the changes needed to convert the old file into the new file.
In addition, the Data Blocks transmitted are LZW compressed if the system thinks it's worth doing.
Communication is by Windows Sockets, so the computers may be linked using the Internet.
The following functions are implemented:
| SetServer: | Tries to connect to the Remote Server. If successful, returns the Remote Server's Current Directory. |
| CD: | Changes the Remote Server's Current Directory. |
| PWD: | Print Working Directory (the Remote Server's) |
| DIR: | Returns a CString containing a listing of the Files and Folders in the Remote Server's Current Directory. Directories are first in the list and have '+' as the first character. The list is sorted alphabetically. |
| Tree: | Makes the Server create a file describing the Directory tree from its Current Directory. (See CFO in DTree.h) |
| Find: | Returns a CFO of a Remote File (if found) |
| Fetch: | Copies a file from the Remote Server to a specified file on the local computer (like SaveAs) |
| Put: | Sends a file to the Remote Server. |
| Get: | Copies a file from the Remote Server (No name changing). |
| Delete: | Deletes a file on the Remote Server. |
If anyone thinks it's worth me maintaining or updating any of the above, please e-mail me.