2013-03-15

Extracting WhatsApp message logs from a WhatsApp database

Just a small update.

After returning from my long trip abroad, having used WhatsApp profusely during the time there, I had lots of conversations I wanted to export to a message log and store for future, older days. ("Wow, I can't believe she wrote me THAT! ;)")

These chat logs included lots of media (images, audio clips and videos), which I also obviously wanted to keep. I used an Android phone, so I looked for an existing solution in the Google Play store. I found an applicaton called WhatsApp to Text which is quite nice, but fails to offer the option of exporting the media. I found no other solution in the Play store.

I then looked on-line for another solution. The team of D. Cortjens, A. Spruyt and W.F.C. Wieringa from the University of Amsterdam have published the results of a research project titled WhatsApp Database Encryption. Based on the results of this project, a Python script called WhatsApp Xtract was coded, to allow the generate of WhatsApp messages logs - this time with all media intact. Great.

Only thing was, some features weren't working correctly. Media wasn't automatically detected in some cases, the generated log files were humungous, the names of contacts were sometimes not displayed, there was no way to see all media files (like in the actual WhatsApp software), it supposedly was able to repair corrupt databases (and salvage what it can from them) but didn't really, and generally it didn't satisfy my requirements.

So, I updated it a bit:
v2.5 (updated by Alon Diamant - Mar 14, 2013)
- Improved encrypted Android database detection and decryption code
- Can now repair malformed Android databases (depends on availability of sqlite3 executable)

v2.4 (updated by Alon Diamant - Mar 06, 2013)
- Generates media index file - but crappily, we should set this up better..

v2.3 (updated by Alon Diamant - Mar 05, 2013)
- now generates separate file for each contact
- fixed file search to search for image files in days prior to date given (to fix a bug where because of timezone differences the image file exists but is not found)
- fixed message counts for contacts
- does not list contacts with 0 messages
- now writes version number in generated files
- (Android Version) displays WhatsApp name (server based) if no display name is found for a contact
- (Android Version) Supports Python 2.6
It still is medicore, and I am not happy with the way it works nor the way it is coded, but it'll serve for now. I do feel like coding an Android application to do this properly on the phone, with well formatted output files that include all media. We'll see.

For now, enjoy. You can also find updated in the project repository.

2009-10-29

CLI definition language

Hey,

I'm sure you've had the occasion on which you developed an interesting application, and now required the ability to control it from the command line. The standard C/C++ functions for implementing various switches and input is nothing fun to work with, and the various template and composition based solutions are too complex most of the time.

Check out this interesting project:
http://www.codesynthesis.com/projects/cli/

You define the options you want in a very, very simple definition language, compile it into a C++ class file and voila: you can use it in your main() function.

Nice.

2009-10-28

Catastrophic failure (Exception from HRESULT : 0x8000FFFF (E_UNEXPECTED))

Are you trying to use an ActiveX (OCX) control in a .NET application?

Did you add the reference, create a class and are quite happy at how easy it was?

So, now you try and run the application and CRASH - Catastrophic failure!!!

The problem is that an ActiveX control requires special handling when embedded into a .NET application. The easiest way to do this is:

1. When in the Form Designer, open the Toolbox window and right click one of the tabs.
2. Click "Choose Items..." menu option
3. Select the "COM Components" tab
4. Find your control in the list and add it.
5. Drag the control to your form.

This will create various new Interop classes, including special event handlers for your events, et cetera. It should work now.

Hurrah!!

2009-10-22

map/set iterators incompatible

So, you've been working on some piece of code that employs STL containers, specifically a map. Perhaps you've also added some multi-threaded flavoring to pack a punch. Great.

Suddenly, you see this:

Debug Assertion Failed!

Program: C:\....\Test.exe
File: C:\Program Files\Microsoft Visual Studio 8\VC\include\xtree
Line: 293

Expression: map/set iterators incompatible
Strange. What just happened? Did you just mix the iterators of different classes together? No, you're certain you used map iterators, not set iterators. Everything SEEMS well...

Ah, that's right. You have an iterator that points to the map's items, and are indeed iterating over them. The map is protected by a mutex, but - so as to not block for too long - you release and reacquire the mutex every iteration. Another thread then executes and clear()'s the map, rendering your iterator useless. You now compare it to the end() iterator, and voila - crash!

The solution? Make sure you don't invalidate your iterators while using them. That means - don't modify the STL container from another thread (even if you are using synchronization mechanisms) if you're not absolutely certain that you can and that you will not invalidate existing iterators. Operations like insert, erase and clear can invalidate iterators - read the details at each container's documentation page.

2009-08-16

Winsock error 10049 with no apparent reason

Hey all,

So, it's been a while since I last updated. I assume I am experiencing the way too common "early enthusiasm leads to early laziness" syndrome.

Without further a do - a lovely issue I had to face.

On certain machines - all Vista SP1 machines, we thought - certain outgoing TCP connections to well known HTTP servers (www.google.com, www.facebook.com, www.microsoft.com etc) simply failed.

The failure returned a last error code 10049 (0x2741) which means "The requested address is not valid in its context". This usually happens when someone tries to use a non-existent local IP address as the socket address, or when someone tries to connect to an invalid remote computer.

I wrote a test program and reproduced the behavior. I discovered that, oddly, using ADDR_ANY works perfectly - but binding to a specific interface doesn't work.

The test program seemed to work everywhere, including Vista SP2 machines. I found no mention of this behavior anywhere on-line. I disabled the Windows Firewall and the on-board ESET NOD32 3.0 Business anti-virus, and double checked that everything was turned off. It still failed. I was stumped.

I posted on alt.programming.winsock asking about this:



Hello all,

I have been witnessing some very strange behavior of Winsock on
Windows Vista SP1, and would like to share my findings to see if
anyone could help me figure out the answer to what's wrong.

I've been running a simple program which tests the behavior of Winsock
when it comes to binding a UDP socket, binding a TCP socket and
binding and then connecting a TCP socket with a wildcard interface and
a specific interface.

On my Windows XP SP3, running as an administrator, I get the following
results:

----------------------------------------------------------------------------------------

Testing bind() UDP with 0.0.0.0:0...
+++ Success!!!
Testing bind() UDP with 0.0.0.0:1024...
+++ Success!!!
Testing bind() UDP with 0.0.0.0:32033...
+++ Success!!!
Testing bind() UDP with 0.0.0.0:55301...
+++ Success!!!
Testing bind() UDP with 192.168.2.110:0...
+++ Success!!!
Testing bind() UDP with 192.168.2.110:1024...
+++ Success!!!
Testing bind() UDP with 192.168.2.110:32033...
+++ Success!!!
Testing bind() UDP with 192.168.2.110:55301...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:0...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:1024...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:32033...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:55301...
+++ Success!!!
Testing bind() TCP with 192.168.2.110:0...
+++ Success!!!
Testing bind() TCP with 192.168.2.110:1024...
+++ Success!!!
Testing bind() TCP with 192.168.2.110:32033...
+++ Success!!!
Testing bind() TCP with 192.168.2.110:55301...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:0...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:1024...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:32033...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:55301...
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:0...
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:1024...
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:32033...
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:55301...
+++ Success!!!
Press any key to continue . . .

----------------------------------------------------------------------------------------

The same results appear in Windows Vista SP2, being run as an
Administrator and as a standard user. Windows firewall is disabled, or
an exception is added, in all cases.

Windows Vista SP1, with firewall disabled, eSET NOD32 antivirus
disabled and no other special software I can think of gives me this
result, however:

----------------------------------------------------------------------------------------

Testing bind() UDP with 0.0.0.0:0...
+++ Success!!!
Testing bind() UDP with 0.0.0.0:1024...
+++ Success!!!
Testing bind() UDP with 0.0.0.0:32033...
+++ Success!!!
Testing bind() UDP with 0.0.0.0:55301...
+++ Success!!!
Testing bind() UDP with 192.168.2.24:0...
+++ Success!!!
Testing bind() UDP with 192.168.2.24:1024...
+++ Success!!!
Testing bind() UDP with 192.168.2.24:32033...
+++ Success!!!
Testing bind() UDP with 192.168.2.24:55301...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:0...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:1024...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:32033...
+++ Success!!!
Testing bind() TCP with 0.0.0.0:55301...
+++ Success!!!
Testing bind() TCP with 192.168.2.24:0...
+++ Success!!!
Testing bind() TCP with 192.168.2.24:1024...
+++ Success!!!
Testing bind() TCP with 192.168.2.24:32033...
+++ Success!!!
Testing bind() TCP with 192.168.2.24:55301...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:0...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:1024...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:32033...
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:55301...
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.24:0...
--- connect() failed: 10049 (The requested address is not valid in its
context.).
Testing bind() AND connect() TCP with 192.168.2.24:1024...
--- connect() failed: 10049 (The requested address is not valid in its
context.).
Testing bind() AND connect() TCP with 192.168.2.24:32033...
--- connect() failed: 10049 (The requested address is not valid in its
context.).
Testing bind() AND connect() TCP with 192.168.2.24:55301...
--- connect() failed: 10049 (The requested address is not valid in its
context.).

----------------------------------------------------------------------------------------

I can't seem to understand why this happens only on Windows Vista SP1,
with a specific *CORRECT* interface.

The program code is attached after this post.

Does anyone have any clue as to what is causing this?

Thanks,

Alon




I wrote an ICMP ping client and discovered that ping worked properly on all machines - including Vista SP1. Strange - is this something to do with specific ports?

I received a response in the newsgroups - something about Vista SP1's anti-malware logic. I was so confused at this point that I actually started looking this up, but naturally - I found nothing.

Then I discovered that a clean Vista SP1 does not exhibit this issue at all - it worked properly.

I started searching for TCP/IP stack hooks, or for NDIS filter drivers - I was looking everywhere for answer, really.

I found nothing!

My final move was to recheck the firewall and anti-virus. I was amazed to discover that shutting down the ESET NOD32 anti-virus and disabling all active protection doesn't disable the HTTP filtering module.

This module, which basically blocks all outgoing connections from or to ports 80, 3128 and 8080 from all non-approved "browser" applications, was the cause of all my problems!

I discovered this issue in only one other place on-line, at yoics.

ESET themselves explain this on their own page, but they don't say anything about what behavior a programmer should expect to see if their HTTP filter is active.

Basically - you can can create a socket, you can bind(), but if you try and connect to an address with port 80/3128/8080 using a socket bound to a specific interface - Winsock error 10049 is thrown. This is as if the address is invalid. Confusing!

Right now I'm looking at anti-virus/firewall detection possibilities - including WMI. This might help us notify our users as to what actions they must take to unblock our application with their anti-virus "flavor".

I hope this post will help others, somewhere, somehow.

2009-06-15

Proper usage of shared_ptr

shared_ptr is a very helpful class. It allows a developer to basically forget about dynamic memory deallocation. Everything is taken care of by the C++0x (or, right now, C++ TR1) library. shared_ptr objects can be used inside STL containers, as opposed to other smart pointer classes such as auto_ptr. Deallocation occurs automatically when the dynamically allocated resource is no longer referenced - this is a great answer to the "caller/callee deallocation responsibility" issue.

In my short amount of experience with shared_ptr objects, I've found some common pitfalls and best practices, so here goes.
  1. Pass shared_ptr objects as parameter by reference whenever possible. Why? Because passing the shared_ptr object by value will cause a copy of the object to be created, using a copy constructor. This is not a very light operation: it creates a new shared_ptr object and sets its internal pointer to the copied shared_ptr's internal pointer, and it increments the reference count in the control block shared by all shared_ptr that point to this internal pointer. This copy construction takes 10-20 times longer than simply passing a raw pointer - and, worst of all, it does not have any added benefits. The copied shared_ptr is destroyed as soon as the scope is exited. In an application that requires many function calls with shared_ptr parameters, this can seriously hamper performance. Passing a shared_ptr object by reference, however, is equivalent to passing a raw pointer. The downside is that implicit casting is not supported when passing a reference.

  2. Don't use shared_ptr objects in performance oriented algorithms. Don't use shared_ptr objects in code that encodes or decodes a buffer, for example. It makes no sense. If the buffer that is being encoded/decoded is passed as a shared_ptr - great. Take the internal pointer using the get() method and code the encoding/decoding using raw pointers. Since the shared_ptr is still in scope - the dynamically allocated object will not be deallocated. Using raw pointers instead of shared_ptr's as the basis for such encoding/decoding can prevent many, many occurences of dereferencing the shared_ptr. In our code, an experience coder made this exact mistake of using the shared_ptr for an encoder/decoder, and discovered that it operated over 20 times slower!

  3. Don't use shared_ptr objects for private data members, unless they are passed in via the constructor or setter methods. I don't think there is any need explaining why this is pointless - if a data member is constructed internally and is used internally, it can be deleted internally.

  4. Use enable_shared_from_this<> when necessary, but remember that shared_from_this() only works when you already have a shared_ptr somewhere in the process. If you call shared_from_this() on an object that was dynamically created somewhere, but does not already have a shared_ptr object pointing to it - you'll see the following exception:

    std::tr1::bad_weak_ptr at memory location 0xDEADBEEF

    or perhaps:

    std::bad_weak_ptr at memory location 0xDEADBEEF

    This is because shared_from_this() actually searches for a shared_ptr control block (which, as we stated above, is shared for all shared_ptr objects and contains the reference counter) inside its internal data structures - and if it does not find a control block (as there is no shared_ptr object at all), it throws this exception. It goes without saying that a shared_ptr to an automatic (stack allocated) variable is as bad as delete'ing such a variable.

ACE_High_Res_Timer runs too slow update

I submitted a patch to the ACE repository.

Hopefully this will be fixed in future versions of ACE, but I'm leaving the previous post intact - for future reference.