Tuesday, December 6, 2011

PDF Metadata Extraction - Multiple Files

This is going to be just a quick, short post (hey, don't laugh - it *can* happen!) with something I wanted to pass along to all my fearless readers.

Here's the scenario: I was stuck in Windows, and had a virtual ton of PDF files from which I needed to extract metadata. No fancy commercial tools such as EnCase were at my disposal to automate the task for me, so I turned to pdfinfo. For those who are not familiar with it, pdfinfo is part of xpdf, an open source PDF viewer utility. PDF file metadata (author, title, revision, etc) is primarily stored in a couple of different places within a PDF - the Info Dictionary, and/or the XMP (eXtensible Metadata Platform) stream. pdfinfo (which is a free utility, by the way) will extract this metadata from within a PDF file. It's a command-line utility, which is fine by me.

I had already located and exported the PDF files in question out to a single directory for parsing, and I was hoping it'd be as quick and easy as pointing pdfinfo to that directory and redirecting output to a file of my choosing. Alas, that was not to be; the tool is designed to be run like

pdfinfo.exe file.pdf

which writes to STDOUT (or can be redirected to a text file, for instance). I tried it against a single file, and that worked fine. I then tried to use my limited Windows CLI knowledge to get it to feed the PDFs to pdfinfo, with no joy. If I'd been in Linux, I would've been more comfortable creating a loop to go through the files and feed a variable (ie, the file) to pdfinfo. I messed around with looping in Windows a bit, but - another piece of the scenario - time was limited (of course!). In the process of trying to work out the loops, I looked at some posts on Commandline Kung Fu and other similar (well, similar, but less awesome, no doubt) sites. I may have had some syntax error or other minor issue that caused trouble, but I couldn't ever seem to get a loop to work, and just didn't have time to keep at it.
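
For the record, the loop I was groping for turns out to be a one-liner at the cmd prompt - something like the following (hedging a bit, since I didn't get to test it against that data set at the time; inside a batch file, %f becomes %%f):

for %f in (t:\output\xyz001_pdf_export\*.pdf) do pdfinfo.exe "%f" >> t:\output\xyz001_pdf_export\pdf_metadata.txt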

So here's my solution: I ran a quick file listing of that directory, and used it in a spreadsheet to build out one line per PDF file, each line parsing that file's metadata and appending the output to a plain text file (it's amazing what a little =concatenate, find/replace, and merge can do). I copied that over to notepad++ and saved it as a batch (cmd) file. Then I just fired off the batch file and let it run through, giving me the metadata I was looking for. Not pretty, not the way the Masters over at Commandline Kung Fu would have done it, but it got the job done. Here's an example, sanitized for public consumption.


pdfinfo.exe "t:\output\xyz001_pdf_export\United 01.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Carpet 02.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Tree 03.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Interview 04.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Local 05.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\TipTop 06.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Safety 07.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Teleport 08.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Sharp 09.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt
pdfinfo.exe "t:\output\xyz001_pdf_export\Water 10.pdf" >> t:\output\xyz001_pdf_export\pdf_metadata.txt


So there it is, a short post (perhaps my first?). Hopefully it's helpful to someone else who needs to extract metadata from PDF files.

-----------------------------------

Just a quick update. In discussions last night on twitter, I mentioned that I thought Phil Harvey's exiftool would process PDFs for metadata. Rob Lee confirmed this, and called exiftool "the bomb-diggity." :) I said I would test it to compare against pdfinfo.

The two applications provide similar information; certainly the core info is the same (such as creation dates, permissions, author). pdfinfo provides information above and beyond exiftool, though, such as encryption status, page size (actual dimensions), tagging, and form data. Before you go thinking that pdfinfo is the way to go, I'll say that I find exiftool's output easier to read; each file entry is clearly separated, and the layout/format is nice (to me). BTW, pdfinfo also reports the filename based on the internal "Title," which can be confusing if the two don't match up. Exiftool reports the filename as seen by the filesystem/user, and the Title per the metadata.

Exiftool also gets you past the need to do any scripting, loops, etc. That's because you can run it like this:


"exiftool(-k).exe" -P t:\output\exports\desktop_pdf_export\*.pdf >> t:\output\exports\desktop_pdf_export\pdf_metadata_3.txt


And it's much faster. So while I still think pdfinfo is a great tool, I'm leaning toward Rob Lee's "bomb-diggity" assessment of exiftool. ;) If I'd thought of exiftool for PDFs in the first place, I'd probably never have seen pdfinfo, so it's a good thing I got to try both out. I think both are good, both are useful, and I'd use both again, certainly for cross-validation.

So there's the quick update. Enjoy!

Monday, November 7, 2011

BSidesDFW Follow-up

BSidesDFW_2011 - My Thoughts
Saturday, November 5th.
Check out the website - speakers, planners, sponsoring vendors, etc.

I arrived late to the fun as my daughter had a soccer game early in the morning; I deemed it a good idea to go to that first, so that delayed my start. Then a poor choice of routes, road construction (thus the reference to route choice) and heavy traffic on side roads (see road construction) further delayed me, and I pulled up to the Microsoft Technology Center in the wonderful mood that traffic/road issues help me find when I'm trying to get somewhere. Yes, folks, that's sarcasm. Drives me nuts, truth be told.

Anyway, I'm an adult, and it's not anyone's fault, so I pulled myself together and went in. I was greeted warmly at the front desk, set up with my raffle ticket and drink tickets for the after party, and given my APT (Advanced Persistent Texans) t-shirt. Shortly thereafter I had the opportunity to meet Michelle Klinger, the main organizer, and everyone else involved in putting the event together. Great bunch of folks.

Since I was running late, I missed Lodovico Marziale's talk on Registry Decoder. That was a major bummer. I really wanted to learn more about it, and ways to use it, straight from the folks that made it. But I wiped away my tears, and headed upstairs to Michael Gough's talk on "The BIG ONE!!!"

This was an interesting talk, to put it simply. Michael made some very salient points about needing a PLAN, needing to educate top-level management on their role, and training them on what happens in a breach (both good and bad). A big emphasis should be placed on not pointing the finger or - even worse - getting rid of InfoSec personnel when a breach occurs. InfoSec is typically seen as being to blame for the breach, when in reality it truly is a responsibility shared by different parts of the business. It's important - for a number of reasons - to give the InfoSec team the time and resources to address and remediate the issue. Far too often, we're blamed, and key personnel are removed (aka fired/terminated/expunged/beheaded - ok, maybe not that last one); this really doesn't help and in fact causes more problems (such as voluntary departures by additional people, public sabotage, and other ongoing problems not directly related).

Another key takeaway was the need for a PLAN (I'm thinking flow charts and everything, maybe even swimlanes! ;)) As Michael described it, if X happens, we will do Y; he related this to plans at a former employer, back when Slammer hit. They actually shut down their internet connection on a Thursday, and didn't enable it again until Monday. That was the plan, and they did it. It cost them a ton of money, but saved the company 3 tons of money (interpretive paraphrase, but you get the point); some people didn't believe they would do it, and gave them flak when they did, but in the long run it was worth it.

That was right before lunch, so once I made it through the food line (mmm, BBQ OR mmm, pizza) - hey, as a side note, the fine folks running the event had gotten hooked up with some local beer from McKinney, so anyone interested was able to have a tasty brew as well - I went looking for faces I might know. Sure enough, I saw Kyle Maxwell. He introduced me to a friend of his, Chris Gathright. After a good lunch, there was a raffle drawing, and prizes were given (just not to me).

Kyle and I hung out and talked until Andrew Case's talk on Data Exfiltration. I had to decide between Andrew's talk and Branden Williams' talk on the Anatomy of an Advanced Attack; Andrew's won out. Kyle and I were the only DFIR types in there, and Kyle had been the only one in Lodovico's presentation, but we expected that. For me, most of what Andrew covered was a review, as it focused on host-based forensics (I was hoping for some coverage of network exfiltration after a breach, but that wasn't the focus).

However, he did some very cool stuff that I've not done before. He used scalpel to index the image, looking for a "header" of a website URL and identifying disk offsets. He then used Sleuthkit tools to map between the disk offsets and the file system, to find which files those offsets fell within; turns out, the pagefile had numerous hits on Gmail indices. So, he dd'd out sections of the pagefile, and ran scalpel against those with a custom file signature; this allowed him to successfully carve out multiple emails that were of interest and relevance. He also used Restore Points to help map out USB history; since he had RPs containing setupapi.log and registry files, he was able to pull usage history on almost a per-use basis, to show how many times several devices were used, and when. Now that's cool! Plus he mentioned a "setupapi extractor tool" that I need to find; I've always gone through setupapi.log with Notepad++, which worked quite well, but I'm always up for a new tool to make my job easier.
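
I haven't recreated Andrew's exact steps, but the offset-to-file mapping is the sort of thing Sleuthkit's ifind and ffind are built for. A rough sketch, with the image name, cluster address, and MFT entry number invented for illustration (divide the byte offset scalpel reports by the cluster size to get the data unit address):

ifind -f ntfs -d 524288 image.dd
ffind -f ntfs image.dd 4859
istat -f ntfs image.dd 4859

ifind returns the MFT entry that owns that cluster (4859 here), ffind turns the entry number into a file name, and istat shows the details for that entry.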

I wasn't sure which talk to attend next, but I was in the Track 1 room, and Michael Gough had another talk scheduled there, about Hacking a CardKey system; Ian Robertson was part of this as well. Sounded interesting, so I stuck around (Kyle went to sit in on the lightning talks); I'm glad I did, as it was interesting, scary, and informative. So as the story goes, "Peggy" (you know, from the commercials) was poking around on the internet and found some open ports (that didn't seem like they should be open), and was able to connect to them using some protocols that didn't seem like they should be allowed. Hmmm. "Peggy" was interested, and so set about finding out what was going on. Turns out, these were on cardkey systems, and they were infinitely pwnable. In the course of the research, "Peggy" and friends were able to build a mobile app that would unlock these systems (or the doors/gates they secured) at will. Ouch. "Peggy" reported the findings to the appropriate parties, and fortunately did not end up in jail. Whew!

By working with vendors, "Peggy" and friends have been able to help get some changes made that will at least provide the option of AES encryption. Just a side note: never assume you know who's at these things, or that they're all one type of people/experience - I was surprised when someone asked what AES was, and why they didn't just use an encrypted password that couldn't be broken; the questioner seemed to have some other very technical knowledge, but it was apparently in a different area than I expected. Anyway, the crux of the biscuit is that these systems are STILL very vulnerable, and if you have any, make darn sure they're not on the internet, or upgrade the ethernet module so that AES is an option (then make sure to enable and configure it). There are still concerns, but at least that's a big help. By the way, I wasn't in on it, but Michael gave a lightning talk about Yubikey usage, and was giving away some free upgrades to LastPass Premium in the cardkey talk. A lot of folks also received Yubikeys, as Yubico was a sponsor. LastPass and Yubikey is a good combo.

The keynote was by Martin McKeay, giving a thought-provoking talk on fundamental flaws in Information Security. This wasn't a technical talk, which he stated up front. It was still very good, though. Don't want that to sound wrong, with the "though" in there. I think folks kind of expect to get down into the nitty gritty at these conferences, and Martin acknowledged that. So I'll put it this way - technical or not, it was a good talk.

My key takeaway was that, as an industry (with a career path) we're very young; only 23 years old. Firefighters, which we're often compared to (and Martin did as well), have centuries of experience, science, and testing behind them. Granted, their knowledge is changing, but they have a strong foundation and a long history. By and large, they KNOW what a fire will do. However, our landscape is changing on an almost-daily basis, our forefathers/frontrunners discovered and made stuff up on the fly, and we're largely continuing in that vein. We need to KNOW infosec, and if what we're doing works. We lack solid metrics, statistics, and facts. Martin pointed to the Verizon Data Breach reports as the best we, as an industry, have, but they really present a small cross-section of what's happened. Same for the Verizon PCI report. I feel kind of like Number 5, saying, "Need more input."

There was an after-party, but I did not stay for that. For whatever reason, I was really feeling tired (maybe being in a Microsoft building all day...), and it being a weekend, spending time with my family is important to me, so I headed on to the house. I enjoyed the event; I think it was well-done and fun, with great speakers, good swag, and best of all - free. I'm definitely looking forward to next year's.

Sunday, October 9, 2011

Artifacts Created by Nmap/Zenmap

The scenario is that I knew a particular system had run an Nmap scan against another particular system, using the Zenmap GUI (and was manually terminated before completion). I know, I know, why wasn't CLI used? Well, it just wasn't, and that's not really part of the problem. The problem was that after the fact, there was a need to correlate the scan against certain event log entries on the target, as it appeared that perhaps the scan had not shut down properly, and continued running in the background.

So how to do it? Does Nmap/Zenmap create any logs or other artifacts on the source system, which remain afterwards? So I put on my forensicator hat, gloved up (technical term for putting your protective gloves on), then announced, "I'm goin' in!" For testing, I ran Zenmap versions 4.6 (because I had it) and 5.1 (which was the current version in question) on XP Pro and Win7 Pro, both 32-bit, both fully patched. I ran scans both from the GUI and CLI (yes, I went ahead and tested that too, as I thought it would be good to know, although not pertinent to the current scenario). The ultimate prize was to be able to show what scan ran when, for how long, against what target (specifically the scan that was terminated early).

I'll also state that I wasn't trying to prove that Nmap/Zenmap was run, so I didn't look to prefetch, user assist, or other similar artifacts. I was trying to prove what was done when the application did run, more so than if it had been run or by whom. Think of a scenario where it might be legitimate for a user to use the application as part of their normal duties, and the question is whether they did something with it they should not have done. "What" and "When" became more important questions than "Who," "How" or "Why."

I'd love to be able to say that I reached the goal, but I'm afraid that's not the case. I found good stuff, but nothing directly showing the terminated scan. Ultimately I did find evidence of that scan in the pagefile, but nothing that gave specific time frames, or the options used. What was useful was that I saw the NSE (Nmap Scripting Engine) info, which showed some details about the scan, a few of which were particulars useful to me (similar to some of the data from the temp files that you'll read about later). So in a round-about way I accomplished the mission, just not in the neat, clear-cut way I had hoped for. But even in that respect, all was not lost...

Results of testing were inconsistent. But wait, I thought you said all was not lost?! Right, it's not. Results were inconsistent, but there is stuff to look for, and I can provide some direction on that, so in case you need to know about Nmap/Zenmap artifacts, you'll have something to work with ahead of time. So, inconsistent results, what does that mean? It means that on either OS, with either version of Zenmap, some things were logged, some artifacts were created, and sometimes they weren't. That's what I mean by inconsistent. Seems like it meets the definition of the word to me. Anyhow, artifacts are created, and I will expand on that, so you'll know the kinds of things to look for.

Basically, there were three areas containing artifacts; two under the user profile, and one in the program directory. Keep in mind that sometimes some of these were present, and sometimes they were not; sometimes they had data and sometimes they did not. I'm just going to explain what I saw, when it was there to see, YMMV. Note: I did not see any of these artifacts when running Nmap from CLI; only when using the Zenmap GUI front-end. Due to the otherwise inconsistent results, I don't know that CLI doesn't create some of these artifacts as well (such as the temp files you'll read about), just that in my testing, I did not see it occur.

In c:\program files\nmap\zenmap\ a file was created when a scan was saved. This had the same user-selected name as the saved scan, with the extension USR. So if the scan was saved as "test" then the resulting file would be "test.usr." If you find one of these, you can bet the user saved a scan; this file should be identical to it. It is an XML file that holds all the information about the scan, and it starts out looking something like this:


nmaprun profile="nmap -v %s" scanner="nmap" hint="" scan_name="" args="nmap -v xxx.xxx.xxx.2" profile_name="Regular Scan" startstr="October 1, 2011 - 16:54" options="Verbose" start="1317506087" nmap_output=" Starting Nmap 4.60 ( http://insecure.org ) at 2011-10-01 16:54 Central Daylight Time Initiating ARP Ping Scan at 16:54 Scanning xxx.xxx.xxx.2 [1 port] Completed ARP Ping Scan at 16:54, 0.48s elapsed (1 total hosts) Initiating Parallel DNS resolution of 1 host. at 16:54 Completed Parallel DNS resolution of 1 host. at 16:54, 0.00s elapsed Initiating SYN Stealth Scan at 16:54 Scanning System (xxx.xxx.xxx.2) [1715 ports] Completed SYN Stealth Scan at 16:55, 39.01s elapsed (1715 total ports) Host System (xxx.xxx.xxx.2) appears to be up ... good. All 1715 scanned ports on System (xxx.xxx.xxx.xxx) are filtered MAC Address: xx:xx:xx:xx:xx:xx (make) Read data files from: C:\Program Files\Nmap Nmap done: 1 IP address (1 host up) scanned in 39.847 seconds Raw packets sent: 3431 (150.962KB) | Rcvd: 1 (42B) " version="4.60" target="xxx.xxx.xxx.2" annotation="" description=""


There was also a "zenmap.exe.log" file under Program Files, but it was not helpful for this purpose. It appears to be an error entry related to the application itself, not relating to activity. This might be helpful to show that Zenmap was run at some point, if that was a goal, but not for showing what scan was run or when.

In %User%\.zenmap (hidden folder) there are primarily three files of interest: recent_scans.txt, target_list.txt and zenmap.db. Recent_scans.txt is a list of saved scans (or perhaps the .USR instance, it's inconclusive at this point); all it has is a list of files with their paths. Target_list.txt is a list of all target IP addresses, separated by semicolons; it has no other information, not even an associated date. Zenmap.db is the fun one; it's a SQLite database that contains a history of what scans were run - type of scan, target IP, XML output (ie, basic scan detail) and time. In my case, the killed scan was not in there, but others were.
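
If you want to poke at zenmap.db yourself, the sqlite3 command-line shell handles it fine. I'm not reproducing the schema here, so treat the table name in the last line as a placeholder and pull the real layout first:

sqlite3 zenmap.db ".tables"
sqlite3 zenmap.db ".schema"
sqlite3 zenmap.db "SELECT * FROM scans;"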

%User%\%Local%\Temp holds another potential treasure trove of evidence. I found temporary files (with no extension) located at this level. Some of them contained no data, some contained only a small amount, and others looked like this (at the start):


Winpcap present, dynamic linked to: WinPcap version 4.0.2 (packet.dll version 4.0.0.1040), based on libpcap version 0.9.5

Starting Nmap 4.60 ( http://insecure.org ) at 2011-10-01 16:34 Central Daylight Time
--------------- Timing report ---------------
hostgroups: min 1, max 100000
rtt-timeouts: init 500, min 100, max 1250
max-scan-delay: TCP 10, UDP 1000
parallelism: min 0, max 0
max-retries: 6, host-timeout: 0
---------------------------------------------
Initiating ARP Ping Scan at 16:34
Scanning xxx.xxx.xxx.1 [1 port]
Packet capture filter (device eth1): arp and ether dst host 00:11:25:D1:04:E0
SENT (1.3820s) ARP who-has xxx.xxx.xxx.1 tell xxx.xxx.xxx.2
RCVD (1.3820s) ARP reply xxx.xxx.xxx.1 is-at xx:xx:xx:xx:xx:xx
Completed ARP Ping Scan at 16:34, 0.80s elapsed (1 total hosts)
Initiating SYN Stealth Scan at 16:34
Scanning xxx.xxx.xxx.1 [1715 ports]
Packet capture filter (device eth1): dst host xxx.xxx.xxx.2 and (icmp or (tcp and (src host xxx.xxx.xxx.1)))
SENT (1.3920s) TCP xxx.xxx.xxx.2:49151 > xxx.xxx.xxx.1:25 S ttl=40 id=15841 iplen=44 seq=237510861 win.14
SENT (1.3920s) TCP xxx.xxx.xxx.2:49151 > xxx.xxx.xxx.1:554 S ttl=55 id=24142 iplen=44 seq=237510861 win=4096
SENT (1.3920s) TCP xxx.xxx.xxx.2:49151 > xxx.xxx.xxx.1:3389 S ttl=47 id=2030 iplen=44 seq=237510861 win=4096
SENT (1.3920s) TCP xxx.xxx.xxx.2:49151 > xxx.xxx.xxx.1:389 S ttl=51 id=32698 iplen=44 seq=237510861 win=4096
SENT (1.3920s) TCP xxx.xxx.xxx.2:49151 > xxx.xxx.xxx.1:256 S ttl=46 id=12578 iplen=44 seq=237510861 win=3072
SENT (1.3920s) TCP xxx.xxx.xxx.2:49151 > xxx.xxx.xxx.1:113 S ttl=54 id=21527 iplen=44 seq=237510861 win=3072


Basically this is a detailed breakdown of the scan, really the veritable motherlode, as it shows the time of the scan, each target port, protocol, scan times, and so on. Very good stuff, when present. The temporary files that had only a little content basically mirrored the type of content in the USR files, so if you don't have one, you might have the other and still have some insight into the scan. Note: All these temp filenames were 9 characters in length, and started with "tmp."
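
A quick way to round those up on a live system (or a mounted image, substituting the profile path) is a plain dir against the user's Temp folder, sorted by creation time so you can line the files up against suspected scan times:

dir /a /o:d /t:c "%TEMP%\tmp*"

You'll likely catch other tmp files too, so look for the nine-character, extensionless names.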

And a slightly tangential question posed on twitter was how to identify a scan from packets. That's probably already documented, but since I was in testing mode, I looked into it myself as well. Fairly simple, right - just start Wireshark, run an Nmap scan, and review the results. It turns out that across the multiple types of scans I ran, there are 60-byte packets, and all have the following content: 00 0d 60 da b4 e7 00 11 25 d1 04 e0 08 00 45 00. Obviously, that's not all the content, but that is what I saw as consistent across all the packets captured. So there you go.
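
One caveat, reading my own capture: those 16 consistent bytes are just the Ethernet header (destination MAC, source MAC, EtherType 0x0800 for IPv4) plus the first two bytes of the IP header, so they'll stay constant for any IPv4 traffic between the same two hosts - worth keeping in mind before treating them as an Nmap signature elsewhere. Combining the scanning host with the 60-byte probe size does make a workable Wireshark display filter, though; the MAC below is from my test box, so substitute your own:

eth.src == 00:11:25:d1:04:e0 && frame.len == 60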

I think that's about it for this post. Hopefully if you're doing an investigation wherein the use of Nmap/Zenmap is key, this will help get you started. As always, happy forensicating!

Monday, September 5, 2011

Keep Track of File Acquisition with ProcMon

This post was inspired by Lenny Zeltser's recent blogs about using NirSoft Utilities in malware analysis.

Have I said that guy rocks? If not, let it be said. He rocks.

When you have to acquire loose native files across clients' LANs and WANs, you come across some difficulties - and slow networks. And large files, like 18GB NSFs that you find out you're pulling across the WAN from Arizona to Michigan. What? Yeah, no one realized that a guy in Michigan had his NSF in Arizona. No wonder he was complaining about email being slow... :) Anyway, aside from that, I've come across lots of situations where a file copy seems to be hung, but you don't want to kill it and try to start over in case it's actually still doing something. The things you want to know in those scenarios are:
* Has the Source already been copied to the Destination
* Has the Source already been hashed
* Has the Destination already been hashed
* Is the process still running/active
* How much longer is it going to take

Even if you could safely preview your destination without risking changing timestamps or otherwise messing something up, it would still show the full file size, regardless of how much has been written, so that's no good. And for the hashing bit, how could you really tell?

Some applications will provide status/progress updates. For instance, Pinpoint Labs' Safecopy will show progress, and even provide an indication of where you are in the process, but it can't tell you whether it's still working or not. PixeLabs' XXCopy can provide a progress bar (and can be paired with Jesse Kornblum's md5deep for hashing, which can also provide a progress report), and Dan Mares' upcopy (my personal favorite) provides periodic updates. But none of these (or robocopy, RichCopy, etc) are able to tell you what you're actually dealing with. Enter SysInternals' Process Monitor, aka procmon.
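
As an aside, if you want the hashing side of a manual copy to report its own progress, md5deep's estimate switch does that (the paths here are just examples):

md5deep -e -r "f:\source_files" > v:\hashes_source.txt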

You have to know how your tool operates, but if you do, procmon can be extremely helpful for discerning if you're still good to go, where you are in the process, and how much longer you can anticipate it taking (you'll have to do your own math). For my example I'm using upcopy, and copying several 10GB files.

upcopy command

Upcopy works by reading the source and writing it to the destination. It then hashes the source, followed by the destination. At that point, it's done. Most apps work in pretty much the same way, but you have to know yours in order to make the best use of procmon.

Naturally, there will be a lot of activity in procmon, so we have to find our executable and set a filter to show only that. Click the binoculars icon or press CTRL-F to open the search feature. You can also just scroll down to find the exe you're looking for, since the search feature will give you every instance of it, which could include results from antivirus scanning it, HIPS checking it, the parent process, etc. Might take longer than it's worth that way! If you're not sure of the actual EXE name, find it first in SysInternals' Process Explorer.

Process Explorer

procmon, "find"

Once you locate your executable, right-click and select "include..."

procmon, "include exe"

By default, procmon shows the most recent activity at the bottom, so you'll need to scroll down. I don't tend to change the configuration, sorting, etc once it's running, as it does use resources. Be aware and judicious in your choices.

The first set of screenshots here show the copy process, from source (F:) to destination (V:) (yes, I'm actually going local to network, the opposite of what we'd typically be doing in a collection; I needed to back up some files to my NAS anyway). You can see the "Offset" in the far right column increasing; this correlates to the file-size, and is one piece in checking progress.

copy progress

more copy progress

Once the copy process is complete, you'll see the activity in the next screenshot. Note the "ReadFile," "END OF FILE" and the "Offset" showing the total file size in bytes (yep it's a 10GB file), followed by the file being closed on source and destination. Then the hashing of the source will commence.
copy complete

You can follow the progress through hashing the same way. All the entries will show "ReadFile" action on the source, and you can keep tabs via the "Offset" value increasing. Note that by doing some math here (change in size over time passed), you can determine an approximate timeframe for completion of that portion.
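
To make that math concrete (numbers invented for illustration): if the Offset climbs by 1GB over five minutes, that's roughly 3.5MB per second, so with 7GB of a 10GB file still to be read:

7GB remaining / 3.5MB per second ≈ 2,100 seconds ≈ 35 minutes for that phase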

hashing source

more hashing source

and yet more hashing source

Once it's finished hashing the source, you'll see the following. Just as before, note the "ReadFile," "END OF FILE" and the "Offset" showing the total file size, followed by the file being Closed. Then the process repeats for the destination.

source hash complete

Here's the destination hashing in progress; same as for the source, only the location is changed.

hashing destination

And, just as before, when the destination is finished hashing, you'll see it clearly in the activity in procmon. There's one other item to note here, though, and that is the logfile activity for the hash data.

destination hash complete

Lather, rinse, repeat, and that's about it. Do be aware that running procmon can consume system resources you might otherwise need (it lives off RAM). That's one of the reasons I like upcopy; it's no resource hog at all. And it's freakin' fast. I know Dan Mares thinks people don't use CLI anymore, but I beg to differ!

No great revelation of forensic truth here, just a little something to keep you in the know on what's happening with a file copy. Could also be applied to other processes where data is being written out and taking a long time, or when you need to more accurately calculate time estimates for the same (for some reason those application/system progress reports don't seem to be very accurate). I've found it useful; I hope others do as well.

Happy forensicating!

Saturday, September 3, 2011

Would You Bet Your Life On It? Or Your Company?

It has been said that Information Security is Risk Management, and I agree with that. For any given situation, you have to identify vulnerabilities and threats (ie, the "risk"), determine ways to mitigate them, and assign some value to the level of risk that remains. If that value is gauged to be acceptable to the organization (even if it's your family) then you move forward. But this isn't (or shouldn't be) limited just to Information Security groups - the same principle applies at home and on the road, and should also be in the minds of those not actively engaged in InfoSec positions. We live in a time and place in which threats abound. Information Security is also not about saying "No" to everything; it's about figuring out how to say "Yes" where it's appropriate, and figuring out ways to reduce the risk (this also applies at home, on the road, with our kids, etc).

To keep this from being a totally pensive piece, I'm going to bring it back into the context of the work we do daily. As many of you are aware, a few months ago I experienced an abrupt change in job status while working in digital forensics consulting. I'm still in a bit of a limbo situation (no, not dancing), but am working a contract gig doing information security. While there are business types defined as being more at risk of cyber attack due to their industry, I think it should be obvious to everyone by this point that we're ALL under attack. I hear people say things like, "Well, we've never been breached, why do you think it would happen to us?" To that I respond, "You've never been breached? How do you know? Can you prove it?" I personally refer back to Dmitri Alperovitch's statement about Shady RAT: in general, he divides companies into those that know they've been breached, and those that don't yet know.

So what's my point? Well, I'm getting there, albeit a little slowly. My point is that I think people today should have a general awareness of security risks, and that this should occur organically (ie, without having to be told). Even granting that mainstream media doesn't talk about APT, and only mentions the smallest percentage of places that are breached and lose control of their data, the info that does get out there should be sufficient. And yet, time and again, people buy cardboard iPads and MacBooks from criminals in gas station parking lots, fall prey to Nigerian email scams, and even let fake IRS emails install malware. But, even if common folk aren't hip to the threat, those in the IT industry will be, right? After all, they've all had to clean up after someone, and they follow "geek" news, not just mainstream, so they at least will get the fact that there are very real threats out there. Sadly, no.

I was at a presentation recently where a guy who's been working in InfoSec for 20 years told a story about his wife opening one of those IRS emails and following the link. She even put in her social security number when prompted. Then she complained of her computer acting strangely, and told him what happened. He "cleaned" the system by running a scan with an off-the-shelf antivirus/antimalware product, and went on, embarrassed that his wife had fallen prey to a scam. His opinion was that the situation was remediated. Really? You ran an AV scan and that's it? Did you analyze RAM, check network traffic, credit report activity, or do any investigation at all? Nope, just ran an AV scan and called it a day. Wow.

And recently at work we had an internal server - one that allows certain users to perform certain tasks - return odd results for one user. It was on a Monday morning, and the results for that one user all appeared to be in Chinese. Do what? Yep, and just for that one user. We approached the admin about the situation, and as it turns out, on Thursday afternoon of the prior week, the admin for that server had installed some new patch rollups. Patch rollups, not fruit rollups. He felt it was probably related to the patches, as opposed to a compromise. Ok, sounds reasonable, but we still needed to play it safe. We pulled volatile data from the machine and started going through that while the admin investigated the patch scenario. We were quickly informed that the patches were to blame; the admin uninstalled and reinstalled (along with a few more), and said everything was good to go (yes, I realize evidence could've just been stomped on). And indeed, it appeared to be fine, and the explanation made sense. But we asked some followup questions nonetheless, and were greeted with the following response (not an exact quote): "I understand you think you're doing your job, but it was the patches, and it's been fixed. I have a lot of things to do, and don't have time to continue wasting on something that's been resolved." Wow, really? Our boss got involved, and there were some additional conversations...

My question when we received that response was, "Sure, it looks like that's what happened, but can you prove it 100%? Would you bet your life on it? Would you bet the company on it?" Because in essence that's what you're doing by turning and walking the other way, and if you're not willing to bet it all, it's probably the wrong answer. No root cause analysis, and with all the companies falling victim to basic compromises, and allowed to bleed data for who knows how long, and you're willing - as an intelligent IT admin - to say that a system which was serving up Chinese characters is good to go, because of a patch? That seems like a bit of a blind risk. It would have taken so little time to go through our followup questions and answer them, and that would have helped shed great light on the situation, to give us all more peace of mind. I guess I shouldn't be surprised, though. And even if I become so jaded that I'm no longer surprised, I think I still get to be disappointed. ;)

Do I have a solution? I wish I did, but I don't. I understand that education is paramount, but I think it takes more than that. I think it takes an awareness and understanding that there is a clear and present danger (er, threat), and a desire to be part of the solution, rather than part of the problem. And that's what I think is lacking - the desire. The IT guy should already have the knowledge, and the InfoSec guy should have the knowledge. And those are just two examples; I could also talk about the IR guy who has no problem - doesn't even give it a second thought - connecting his laptop up to public wireless. Do we just get complacent and lazy as humans? Or is it that some of us aren't driven and determined to make a difference, and are just trying to get by until it's time to go?

Well, I think that's about all I have. I do want to take a moment to say that there are a lot of folks out there who are driven and determined to make a difference. Just take a look at the blogs I read, for a very small selection. I don't really do the #FF thing on twitter, but I'll give a shout out to Lenny Zeltser as I find his blog extremely practical and helpful. I don't think a post goes by that I don't get something very useful out of it. Thanks for sharing!

For those in the US, have a wonderful Labor Day weekend! For everyone else, get back to work! :) And since our attackers don't honor holiday weekends, be alert; we obviously need more good lerts! :D

Saturday, August 13, 2011

Is Scottish Fiddle like Digital Forensics?

A while back on a job interview, I was asked what I enjoy/how I spend my time besides digital forensics. I of course explained that I play Scottish fiddle and go mountain biking. The guy interviewing me commented that a lot of people in DFIR play music. Now, I'd not heard that before, and I have kind of wondered if it's really that prevalent, and why it might be. Is there something about DF that attracts musicians, or vice versa? And what about Hal Pomeranz's predilection for dancing? ;) Then recently I thought of a different question - is there a similarity between Scottish fiddle (since that's my music) and digital forensics?

First a little background. I hate to break it to anyone who might think there's a trend, but I'm not a musician, or at least not what I would call one. Other than a little bit of exposure to playing piano as a young child, and playing the recorder in elementary school, I've never played any musical instrument. I certainly wouldn't call either of those experiences the makings of a musician, that's for sure! I wasn't even in band in school; I had some friends there, if that counts for anything, but I got my geek card from the fact that I had a computer. Yep, a Commodore VIC-20, and all mine. ;)

I grew up listening to bagpipe music, and really enjoyed it. I thought it would be neat to learn how to play, but knew that pipes are expensive, and extremely difficult to play. As an adult listening to more traditional music, I realized that fiddle also played a prominent role, and in some cases sounded a lot like the pipes. A friend encouraged me to try to learn, as it seemed a less daunting task. What I came to find out is that no decent instrument is inexpensive, and the fiddle is also a very challenging instrument; it is generally considered to take years to learn to play well.

I'm going to take a little side trip here, and refer everyone over to Chris Pogue's blog post, How Do I Get There From Here. I enjoy Chris's blog, and really enjoyed his Sniper Forensics talk at the SANS Summit this year. Something he wrote in this one really resonated, and I thought about it a lot with the question already at hand. He wrote about needing to "get" forensics in order to excel at this work, and about having the drive to succeed. He said the following in regard to his experience breaking into the field: "I had all of the required skills (networking, Linux, Windows), no different than any of the other applicants. But, what I had that they did not was raw desire. I wanted this job more than anything. I read anything I could get my hands on that dealt with the subject, spent my own money setting up a makeshift lab to play with tools, and perform experiments." He ended up getting the job because of these characteristics.

I can draw some lines between playing a musical instrument and digital forensics. You have to take care of your instrument, keep it clean, in good shape/working order, and so on - that's akin to updating your systems, firmware, laying down new baseline images, configuring software, etc. A good musician also stays in practice, playing every day. They learn new songs, new techniques, and explore their music. Concert musicians certainly warm up (not just musically, but physically as well) before a big performance. So too, we forensicators need to stay in practice, learn new things, and focus our minds (our warmup) before (and during) an investigation. The quality of your instrument (equipment) matters, but a good musician can play a cheap instrument and make it sing. The real power does not rest in the tools you use, but in the skill with which you use whatever instrument you have - starting with your mind. At the core I think they're both a Discipline, and they're both Fun. D+F. DF. Get it? Okay, okay, so I'm a geek. :D

Now to take it a little further and get more specific. Scottish fiddle is not like classical violin. The instruments are the same (yes, a violin and a fiddle are the same thing), but it's the language of playing that's different. Language? Yes, absolutely. Every style of fiddle music - Bluegrass, Old Time, Cajun, Appalachian, Irish, Scottish, Galician - has its own figures of speech, idioms, and nuances of bowing patterns, fingerings, as well as rhythms and tempos. Most of it relates to a dance (Hal, that's your cue). And it is a challenging instrument. It's what's known as a "vocal" instrument because it follows the human vocal range, and there are no frets or keys to guide the player to stay in tune. The player has to be able to hear very closely the changes in pitch, feel the movement of the song and subtle intonations of the music. To do this well, I posit that you have to live it, breathe it, sleep it. If you don't "get" the music, you won't be able to play it well. You'll just be playing the notes; you might be in tune, you might be on time, and you might even have some feeling in it. But you won't really be playing Scottish fiddle, because you don't really understand the language (things like burl, Scots snap, back-bowing). Ever heard a Texan who learned Spanish in high school try to talk? Ay, caramba. Que lastima, pobrecito. Yeah, it's usually pretty bad.

At age 28 with no prior musical background, I took up the task of learning to play Scottish fiddle. I found a guy who enjoyed (generically) Celtic fiddle, and understood the language enough to point me in the right direction while teaching me the foundations of it. I learned what I needed in order to be able to teach myself more. I took lessons for about a year and practiced an average of 6 hours per day for more than a year. I listened to Scottish music all the time until it oozed out of my pores. I started out with a cheap Chinese instrument that wouldn't stay in tune, eventually working up to a decent German one. By the way, I took to mountain biking the same way (on my department store 50-lb bike, again working up to a 26-lb aluminum hardtail), never letting any trail or obstacle daunt me. Before long I was leaving chain-ring scars on 10" logs, doing downhill nose-wheelies around hairpin turns, and climbing root-covered switchbacks without dabbing. And yes, a bunch of times I wiped out hard, broke my bike, and limped home battered, bloody, and bruised. But satisfied, and happy. When you really want something, and you work hard to achieve a goal, it truly feels wonderful. It's not the tool that makes this possible, it's you.

So what does this mean to forensics? Well, I think it takes a whole lot of determination, guts, and sticktoitiveness (probably not a real word, but you get the point). Like Chris said, you've got to really want it, and make it your life. After some years in IT taking care of small networks and their systems (and users), I really didn't want that any more. In dealing with malware, I'd learned about the registry, prefetch, MRUs, pagefile, hiberfil, RAM, and artifacts like ntuser.dat and user assist keys. I'd worked on hardening systems as a part of protecting the network, performed rudimentary pentesting and security auditing, trying to make sure I'd done my job. I got irritated with XP's shenanigans when it decided I'd changed too much hardware and could no longer be allowed to log on to my machine (even in Safe Mode), so in addition to reinstalling fresh I started dual-booting into Linux. Ha, take that, Windows! That stuff was the fun part, not the rest of the day to day grind. Then I found forensics, and that really piqued my interest! I was graced with the opportunity to come on as a contractor with a forensics consulting firm to help on the back-end with a large security gig. Like Chris, I devoted myself to learning everything I could about digital forensics. I couldn't afford the good books, so I had to roll drunks at forensic conferences and get their swag (signed books, course material, etc). Just kidding! Really, I went through used bookstores regularly, constantly checking for anything vaguely relevant. I asked questions, practiced, tested, applied whatever knowledge or new technique I found, and just soaked it in. After I was hired on as a permanent employee, I didn't stop, but kept living, learning, and doing more. I've been able to attend some great training, gain a few certifications, and even buy some brand-new books. Side note - if you want to try to roll folks at a forensics conference to get their swag, beware. There's something called the #___smash (name redacted to protect the innocent)! You've been warned.

I've met a lot of really good IT folks who express an interest in DF, but not enough to go after it. Not to mention when you say something about the registry, they respond with things like, "Sure, but what good is that really going to do?" or just look at you blankly. It takes more than just being "good with computers" to do well in this field. I've used myself as an example because that's the only one I have at hand, but don't think for a second that I'm saying I've "arrived." Not a chance. For everything I've learned, all I know is that I only know enough to know that I don't know the full extent of what I don't know. I consider myself blessed to have "found" digital forensics, to have had people who were willing to take a chance on me, and that we have such a wonderful community of folks who share their knowledge freely; people who break new ground daily and give back every chance they get. Folks everyone knows, like Rob Lee, Harlan Carvey and Chris Pogue. Others such as Kristinn Gudjonsson (What, not mention the creator of log2timeline? I might lose my fanboy status!), Ken Pryor, and Jimmy Weg. These are just a few of the many wonderful people that make up our great community. Kudos to you all!

So how is SF like DF? They are both challenging and difficult to learn. Learning and excelling at these crafts takes a lot of determination, drive, patience, and understanding (not necessarily book-learning, but a gut-level perspective). Like Chris pointed out, you have to really want it - I really think this is key. You can never stop practicing and learning (and you'll probably never want to). If you do you'll lose your edge. Finally, they're both incredibly rewarding on a personal level. Nothing like it in the world! There is still the larger question about whether there's some connection between musicians and forensicators. I'm interested to hear others' thoughts on the matter.

I think that's about it. Thanks for reading, and happy forensicating!

Sunday, July 10, 2011

Encrypted Container File Recovery

Scenario:
During a technical interview, I was told definitively, categorically, unequivocally that it was impossible to recover deleted files from within an encrypted container, even if you possess the key. Windows was the OS of choice, and he insisted that he knew from personal experience as they use encrypted containers. He stated that all the file data would be garbled due to the encryption, so it didn't matter if you recovered it or not; the content would not be readable. File wiping was not part of the equation (I asked). This did not sit quite right with me, but I was in the middle of an interview and did not already know the answer, as I had not ever encountered that type of situation. I expressed surprise that that would be the case, but left it at that.

My thoughts:
Pondering it over on the way home, more and more it just didn't make sense to me. The files themselves are not individually encrypted; the container is encrypted, and it has to be mounted in order to access the files. When a file is deleted, the container is mounted and the file is in a readable state; therefore its remains should exist in a readable state within the container. If the container itself is deleted, that's a different matter, although it is feasible that the container could be recovered. All you should have to do is mount the encrypted container, and search within it for the file, which would still exist until overwritten.

I determined to test my theory as soon as possible. Of course, that took a little longer than I initially wanted, as I was “side-tracked” by continuing Dropbox research. However, I outlined a plan, and recently sat down to work through it. Initially, I thought of looking to the host file system, as the OS would provide access to the files through the encrypted container. As I continued the thought process and began testing, I realized this would not be exactly true.

The host OS' file system (ie, MFT) would not reference these files, as they only exist inside the container. The container has its own file system (and MFT), which would be the home for the files' information (and possibly the files themselves, if they were small enough to be resident). Then of course, there's unallocated space. The host OS could potentially have residue in places like the pagefile or memory, but that should be it.

Now, I don't know what type of encryption they use, but I'm thinking in the end, an encrypted container is going to work the same, regardless of flavor. If I'm wrong, well, then the following is true at least for TrueCrypt 7.0a. I approached this research from the standpoint that the files would have been created directly in the container, rather than on the host file system and subsequently moved into the container. Had it been the other way around, there could be artifacts or residue left on the host. I also assumed a fixed container size; with dynamic, the process might end up being a bit more complex.

The process:
The basic idea I had was to create some files inside an encrypted container, confirm methodically that they existed there, and confirm they weren't on the host file system. Then remove/delete them from the container, try to identify their remains, and attempt recovery. So here's the basic outline of my steps and actions; I've tried to retain some order to it and hope it makes sense:

1. Create a 500MB TrueCrypt container.

As a side note, I did this inside Dropbox and had no difficulty; some people have had trouble, and I think it might be related to creating a dynamic container, rather than fixed-size.

2. Create four (4) text files inside the container, filling them with specific text from Altheide & Carvey's excellent Digital Forensics With Open Source Tools, since I had just read that.

3. Confirm files' association to host file system
a. Without mounting the container, extracted host system MFT using FTK Imager, mounted with Sanderson Forensics MFTView, exported to XLS & TXT, and searched for filenames. None present, as anticipated.
b. With container mounted, extracted host system MFT, mounted, exported and searched. Still nothing.

4. Confirm files' association to container file system
a. With container mounted, extracted logical drive's MFT (ie, for the container), exported, and searched. Files were identified, as anticipated.
b. As these files were small enough to be resident in the MFT, all content is visible.

5. Move all 4 files out to a new location outside of the container (not securely). Obviously, the container is mounted.
a. Extract logical drive's MFT, mount with MFTView, locate resident files.
b. Viewing the hex/text of the entry in MFTView, was able to recover all 4 files to matching hash value.


6. What about larger files not contained in the MFT?
a. Copy the entries.log file (4.23KB) - a Dropbox artifact - into the container.
b. MFT contains entry but no contents for non-resident file.

7. Delete files (no wiping) from within mounted container and attempt recovery.
a. Mount logical file system in FTKi.
b. INFO2 file contains list of files.
c. Export "Dt1" etc files w/FTKi, hashes match original.
d. Exported MFT and mount in MFTView. Files are resident at “root” of MFT and in Recycler; fully recoverable from both locations.


As a side note, sometimes MFTView seemed to have difficulty displaying the file contents correctly, and thus the extraction of that data to recover the file would result in a hash not matching. This did not occur all the time, and was observed to happen whether the file was live or deleted. Obviously, the file contents weren't actually stored that way, so it was some programmatic issue. I don't know whether MFTView is a current application or not, as I don't see it listed on the website any longer. I have not (yet) contacted Sanderson Forensics about it, as I don't think it matters for the purpose of this research. An example is below (first with issue, then normal):



Now back to the process...

8. Search host filesystem for files (pagefile, RAM)
a. Copied files back into container, reboot (this flushes my pagefile), open each file
b. Used FTKi, exported host system pagefile.sys and RAM
c. Ran Sysinternals strings (5-character minimum) and output to text files.
d. Loaded & searched in notepad++. Found all 4 test files' contents, plus large portions of entries.log.

Cut away from the outline for a moment. I re-deleted the files, imaged the logical FS and mounted in ProDiscover Basic to search for the files. Found them as they existed in Recycler, to be expected. I purged Recycler by drilling down, selecting all files and deleting again. Keyword searches in Unallocated space weren't turning up the files, and it finally dawned on me that it was because the files were resident in the MFT, so as long as the entries existed there, they wouldn't show up in Unallocated. I needed larger files across the board. So I copied back in the entries.log file, along with a DOC, XLS, and PDF. Deleted (the PDF was too large and was “permanently deleted”) and re-imaged.

9. Load image, search for files (using ProDiscover)
a. Content search: DOC, XLS, and TXT were in Recycler. PDF was in "Deleted" (too big for Recycler). INFO2 listed filenames. The .~Lock file for the DOC was also in Deleted.
b. Cluster search: Two hits on "SIFT" that appear to be related to a PDF file, and contain the filename.

10. Dig a little further (using ProDiscover)
a. Emptied Recycler (drilled down, selected all files and rt-click delete)
b. Re-imaged w/FTKi, reloaded, and Content search. All files found in "Deleted."
c. Searched Clusters, same PDF hit, found TXT file, possibly DOC/XLS (since binary).

11. Final push
a. Carved image with photorec, using default settings.
b. 4 files were recovered: the PDF, TXT, DOC, and XLS. The TXT file is ~1/2 the size, but since there's no header info, it's pretty incredible anyway. All except the text file hash-matched the original.


Summary:
This was a fun little exercise, and I think I can categorically, definitively, unequivocally state that it absolutely is possible to recover deleted files from within an encrypted container when you have the key to the container. Obviously, there are variables. If the container size is dynamic, for instance, this could impact things, but I think the odds are still fairly good, and the process is essentially the same. The amount of time that has passed – as with any investigation – is important, but with close proximity it may even be possible to find the files' content (if previously viewed) in pagefile or RAM.

But the core is that it is possible to recover. Knowing the content and the filenames, I was able to easily recover deleted and purged (but not wiped) files from within an encrypted container. I was also able to carve the files without any use of filenames, contents, or type by an automated process. The process could be done more manually by using Sleuthkit or other utilities. Anyway, it can be done, and that's that.
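
For the curious, the "more manually" route with Sleuthkit would look something like this against an image of the mounted container volume (the image name and the MFT entry number are invented for illustration):

fls -f ntfs -r -d container_volume.dd
icat -f ntfs container_volume.dd 64 > recovered_file.txt

fls -r -d lists deleted entries along with their MFT entry numbers, and icat pulls the content for a given entry - assuming the clusters (or the resident data) haven't been reused.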

Wednesday, July 6, 2011

Dropbox Forensics Follow-Up

Several months ago I started on a quest to research locally-created artifacts related to the use of Dropbox on Windows systems. This took several months of work as time allowed, in order to complete the outline I was following. This culminated in a blog post on SANS, a more complete article hosted on Forensic Focus, and a summary of artifacts on Forensic Artifacts. However, that's not all I have to offer on the subject. Yes, folks, for a limited time only, when you buy all three you get a fourth for free! That's a $19.95 value, included at no extra cost! (shipping & handling not included; residents of the UK must pay VAT - I know, it sucks)

At the end of the article (hosted on Forensic Focus), I wrapped up with some outstanding items, or possible other things to research. I have spent some more time going over some (only some, not all) of those; this follow-up post will cover my additional research:
1. Does unlinking (local or web) change the registry?
2. What impact does uninstallation have on the registry?
3. What are the various “hash” values; what do they signify?
4. Do the IP addresses vary with geographic area?
5. What data is transferred across the unencrypted connection?
6. Do the SQLite databases contain deleted entries, and how can those be parsed?
7. Are file/system IDs or encoded info stored in the databases, 'entries.log' or elsewhere?

1. Instead of using Sysinternals ProcMon or RegMon, I ran regshot 1.8.2 to create snapshots before & after each unlinking. Initially I kept getting BSOD'd every time it scanned the registry, but switching systems eliminated that issue. Ultimately there were no obvious registry changes related to the unlinking (local or web).

2. I used regshot before & after the uninstallation as well, and quickly identified 49 deleted entries (truncated here; complete on Forensic Artifacts):

HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers\DropboxExt1\: "{FB314ED9-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}"
HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers\DropboxExt2\: "{FB314EDA-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx\Software\Dropbox\InstallPath: "C:\Documents and Settings\username\Application Data\Dropbox\bin"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx\Software\Microsoft\Windows\CurrentVersion\Shell Extensions\Approved\{FB314ED9-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}: "DropboxExt"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx\Software\Microsoft\Windows\CurrentVersion\Shell Extensions\Approved\{FB314EDA-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}: "DropboxExt"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx\Software\Microsoft\Windows\CurrentVersion\Uninstall\Dropbox\UninstallString: ""C:\Documents and Settings\username\Application Data\Dropbox\bin\Uninstall.exe""
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx_Classes\CLSID\{FB314ED9-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}\: "DropboxExt"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx_Classes\CLSID\{FB314EDA-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}\: "DropboxExt"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx_Classes\CLSID\{FB314EDB-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}\: "DropboxExt"
HKU\S-1-5-21-xxxxxxxxx-xxxxxxxxx-xxxxxxxxx-xxxx_Classes\CLSID\{FB314EDC-xxxx-xxxx-xxxx-xxxxxxxxxxxxx}\: "DropboxExt"

I've x'd out the SIDs and most of each GUID to (hopefully) make the list easier to scan, and because I didn't want to post the full values on the internet. I left the first segment of the DropboxExt CLSIDs intact since that part changes in a noticeable, incremental way (ED9, EDA, EDB, EDC).
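As a quick triage idea, you could check a live system for leftovers of those keys programmatically. Here's a minimal sketch with Python's built-in winreg module, using a couple of the machine-wide locations from the list above; the per-user keys live under the account's SID in HKU, or under HKCU when run as that user.

import winreg

# A couple of the Dropbox-related locations identified above. The HKU keys
# would need the target account's SID; HKCU is used here as a stand-in.
KEYS = [
    (winreg.HKEY_LOCAL_MACHINE,
     r"SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers\DropboxExt1"),
    (winreg.HKEY_LOCAL_MACHINE,
     r"SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers\DropboxExt2"),
    (winreg.HKEY_CURRENT_USER, r"Software\Dropbox"),
]

for hive, subkey in KEYS:
    try:
        with winreg.OpenKey(hive, subkey):
            print(f"PRESENT: {subkey}")
    except FileNotFoundError:
        print(f"absent:  {subkey}")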

3. There is actually a correlation between the "hash" values in the various config files. It should be noted that Dropbox hashes files in 4MB chunks and stores those hashes base64-encoded, so a single file may have multiple hash values (but only when it's larger than 4MB). Here's where I've followed the trail of hashes (a rough sketch of the block-hashing idea follows the list):
- filecache.db: block hash field
- entries.log: 5th section is the hash
- sigstore.db: stores the hash (and size in bytes)
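To illustrate the 4MB chunking, here's a rough sketch of computing per-block hashes and base64-encoding them. Note that SHA-256 is my assumption for the digest algorithm; I haven't confirmed which one Dropbox actually uses, so treat this as the general idea rather than a way to reproduce their stored values.

import base64
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # Dropbox hashes files in 4MB chunks

def block_hashes(path):
    """Return a list of base64-encoded digests, one per 4MB block.
    (SHA-256 is an assumption here, not a confirmed Dropbox detail.)"""
    hashes = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digest = hashlib.sha256(block).digest()
            hashes.append(base64.b64encode(digest).decode("ascii"))
    return hashes

# A file larger than 4MB yields multiple hash values; smaller files yield one.
print(block_hashes("some_synced_file.bin"))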

4. I know that some application updates will reach out to different servers based on geographic location, and I wondered if the same was true for Dropbox. NirSoft CurrPorts made it easy to gather the active connections here in Texas. I had reason to take a trip to California, so I did the same thing there. Finally, I established a VPN connection to another country and checked the connections that way as well.

There were some minor variations between the locations for IP addresses, although host names remained largely the same. The one thing that did not change in any of these was the IP address and host name for the sole HTTP connection (unencrypted, to port 80).
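If you'd rather script that check than use a GUI tool, something like the psutil module (my choice here, not anything CurrPorts or Dropbox requires) can pull the same connection list and flag the port 80 session:

import psutil

# List established TCP connections belonging to the Dropbox client and
# flag anything going to remote port 80 (the unencrypted session).
for proc in psutil.process_iter(["name"]):
    if (proc.info["name"] or "").lower().startswith("dropbox"):
        try:
            conns = proc.connections(kind="tcp")
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue
        for conn in conns:
            if conn.status == psutil.CONN_ESTABLISHED and conn.raddr:
                note = "  <-- HTTP, unencrypted" if conn.raddr.port == 80 else ""
                print(f"{proc.info['name']} -> {conn.raddr.ip}:{conn.raddr.port}{note}")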

5. So then there's the question of this single unencrypted connection. I had not previously examined the content of this traffic, but I have now, using NetWitness Investigator to isolate the connection stream of interest and export it for posterity and further review.

It's basically a "Hello, here I am" and "Let's keep the connection going" type of conversation. Of course, it's in clear text. My concern is that it transmits the namespace ID (root_ns, from config.db), and possibly the IDs of shared directories as well (there's a second entry that follows the namespace format, but I haven't been able to confirm that yet). With some of the Dropbox-related security issues that have recently come to the surface, I'm a little uneasy about this data being transmitted in the clear, especially when I don't know for sure whether it can be exploited (and since the IP address and host name are always the same).

6. Deleted entries within the SQLite database files can indeed be recovered. I suspected as much, but I'm not a DB (or SQLite) guru. Historically I've relied on others to develop a tool I can use for this purpose, and I've stuck to my guns in this instance. CCL-Forensics has a product designed for this purpose, called epilog; while it's a commercial product, there is a 7-day trial available.

I must say, it works quite nicely. I removed some files from my Dropbox folder just for this test (relocated them to another directory), then downloaded (you have to register, but no sales personnel have contacted me yet), installed, and ran epilog. They have some videos on YouTube, but I found the info I needed in their Help file. There are different methods to recover deleted entries, but I simply focused on the "Free Page Analysis," which parses the freelist (the linked list of free pages) within the database. It very definitely did what I needed it to do.
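For the curious, you can at least confirm whether a database has freelist pages worth recovering before firing up a tool like epilog. The SQLite file header stores the page size at byte offset 16, the first freelist trunk page at offset 32, and the total freelist page count at offset 36; here's a minimal sketch (the filecache.db name is just an example):

import struct

def freelist_info(db_path):
    """Read the SQLite header: page size, first freelist trunk page,
    and total number of freelist pages (offsets per the SQLite file format)."""
    with open(db_path, "rb") as f:
        header = f.read(100)
    if not header.startswith(b"SQLite format 3\x00"):
        raise ValueError("Not an SQLite 3 database")
    page_size = struct.unpack(">H", header[16:18])[0]
    first_trunk, freelist_count = struct.unpack(">II", header[32:40])
    return page_size, first_trunk, freelist_count

page_size, first_trunk, count = freelist_info("filecache.db")
print(f"page size: {page_size}, freelist pages: {count} (first trunk page: {first_trunk})")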

Edit: I meant to note that epilog's report export basically goes to an XML file, which may not be directly what you need. I wanted to look at the data in a spreadsheet, and most methods to convert XML to CSV involve a couple of steps (i.e., XSLT); I found XSlicer to be very helpful.

7. And yes, other encoded data does exist within different config files. Dropbox makes use of base64 encoding, and one of the key places is the "entries.log" file located within the ".dropbox.cache" directory inside the user's Dropbox folder. (This set of artifacts is discussed in more detail in the Forensic Focus article.) By cross-referencing with the various parsed database files, I was able to decipher the pipe-delimited entries.log file (a small parsing sketch follows the list):
- 1st section: filename (as it exists in the .dropbox.cache directory)
- 2nd section: root_ns/path
- 3rd section: Unix epoch timestamp
- 4th section: size (bytes)
In addition, the 2nd row of the host.db file is the user's Dropbox path.
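Putting that layout together, here's a small parsing sketch. The field order follows my observations above; I'm assuming the timestamp and size fields convert cleanly to numbers, so verify against your own entries.log.

import csv
from datetime import datetime, timezone

def parse_entries_log(path):
    """Parse the pipe-delimited entries.log from the .dropbox.cache directory.
    Field order per the layout above; the 5th field is the (base64) hash."""
    records = []
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        for row in csv.reader(f, delimiter="|"):
            if len(row) < 5:
                continue  # skip short or garbled lines
            records.append({
                "cache_filename": row[0],
                "root_ns_path": row[1],
                "timestamp": datetime.fromtimestamp(float(row[2]), tz=timezone.utc),
                "size_bytes": int(row[3]),
                "hash": row[4],
            })
    return records

for rec in parse_entries_log("entries.log"):
    print(rec["timestamp"], rec["size_bytes"], rec["root_ns_path"])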

So that pretty much wraps things up. I did not do any research into alternate file transfer methods (I think Dropship has addressed that rather well), but I did note that if you share a file via the Public folder you can get a link to that file; the link can be passed along via email, IM, etc., and the file downloaded by whoever has it.

Some other resources:
I've already mentioned epilog, which I think has great potential.

There's also Dropbox Reader by ATC-NY; it's a set of Python scripts to parse the SQLite files (they pull from the Dropship project). In addition to something like SQLite Browser, these can be very helpful for gathering and cross-referencing information.
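If you'd rather poke at the databases by hand before (or alongside) those scripts, Python's built-in sqlite3 module is enough for a quick dump; this sketch makes no assumptions about the schema and just walks whatever tables it finds:

import sqlite3

def dump_tables(db_path):
    """Dump every table in an SQLite db (config.db, filecache.db, etc.)
    without assuming a schema, for quick eyeballing and cross-referencing."""
    con = sqlite3.connect(db_path)
    try:
        tables = [r[0] for r in con.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        for table in tables:
            print(f"--- {table} ---")
            # Table names come straight from sqlite_master, so quoting them is enough
            for row in con.execute(f'SELECT * FROM "{table}"'):
                print(row)
    finally:
        con.close()

dump_tables("config.db")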

Derek Newton has done some good research, hosted on his blog:
- Forensic Artifacts
- Security Issues

Great paper on cloud security (with focus on Dropbox) by SBA-Research; the actual download is here

I've mentioned the Dropship project a couple of times, but it has been "officially" shut down. The research behind it showed that it was possible to "share" files without using the Public folder, thus potentially facilitating illegal file-sharing. Although Dropship is no longer developed (by the originator), other forks can be found.

I think that's about it, folks. Unless something else comes up to pique my interest (I'm open to suggestions), I think I'm about done with Dropbox research for now. It's been a lot of fun going through this process, and I've learned a lot, which is also good. Hopefully this will all prove useful - to myself and others - in our forensicating efforts.

Wednesday, June 29, 2011

Forensicator: A Definition

Lee Whitfield recently requested definitions for "forensicator" on twitter, as he wanted to submit to Webster's dictionary. It got me to thinking about how to define the word (yes, I enjoy thinking), so here's my stab at it (in something similar to Webster's format):

forensicator
fo-ren-si-ca-tor | noun | \fə-ˈren-zi-kā-tər\

Definition of FORENSICATOR
1 :Individual who understands and enjoys the employment of advanced techniques in the investigation or analysis of artifacts contained within digital media (computers, networks, smartphones, removable/portable storage, etc)
2 :Individual professionally or personally engaged to perform the actions described above
3 :Compliment typically given by one such individual to another

- fo-ren-si-ca-ting | verb

Origin of FORENSICATOR
Coined by BJ Lachner and popularized on the Cyberspeak podcast.
Source: http://computer-forensics.sans.org/community/lethal-forensicator
_________________________________________

Well, that's my contribution. Just a little fun on a Wednesday morning.

Monday, June 27, 2011

Dropbox Forensics Article Hosted

I would consider the short writeup about Dropbox posted on the SANS Forensic blog to be a great success. There was considerable feedback, as well as a number of folks commenting on twitter. I'm glad there was interest and that it was found to be useful; mission accomplished there.

For anyone interested in the full article, it is now on Forensic Focus. Many thanks to Jamie Morris for providing hosting - not just for my research, but for all the others out there as well.

Hope you enjoy it, let me know what you think.

Saturday, June 18, 2011

DFWOST Book Review

Okay, so I promised a book review and here it is. Don't expect more of these, please. They might happen, but that's not my focus. I'm doing this one simply because I wanted to, and I guarantee there will be no forthcoming schedule of reviews, nor any paradigm shift for this blog.

So the book is Digital Forensics With Open Source Tools by Cory Altheide and Harlan Carvey. I met Cory at the Summit, and he is - as they say - a pretty cool cat when it comes to forensicating. And he is the sole reason that Hal Pomeranz works with Mandiant (at least according to Rob Lee). ;)

Unlike Eric Huber (see his review on Amazon), I did not receive a free copy of the book to review, didn't win one for getting Cory a Monster drink, and didn't get any other "gimme" version of the book. I got it the good old-fashioned way - I bought it. So I'm doing my part to contribute to the financial wherewithal of the authors. :)

Rob Lee made a point at the Summit that the name of the FOR408 course was changed from "Computer Forensic Essentials" to "Computer Forensic Investigations - Windows In-Depth" because the former seemed to be driving folks away. They were apparently concerned that it was "basics" and thus not as valuable. Never mind that (IMO) we need to be constantly reminded of the basics. As an example of their importance, the US Army retests soldiers every year on core competencies, including marksmanship and certain tasks that are critical to battlefield survival. Why? Because you have to be ready, you have to remember, and there's no room for error. Mistakes will still happen, but the goal is to minimize them as much as humanly possible. I think forensics is very much the same.

Anyway, the point of all that is that I think this book is very easily one of the "Essentials" of computer forensics. Don't get me wrong, there are a lot of other good books out there, and this is by no means a pure beginner's book. However, for someone with some basic understanding, some exposure to the field (in other words, someone who wants to be a forensicator and is doing their due diligence), this is a very good introduction to some of the deeper concepts we deal with. It's also a good refresher. I will admit, I was familiar with most of the topics in this book, but then I have Brian Carrier's masterpiece on file systems, I've been through SANS courses and so on. I will also admit that I learned new things, got some very good tips, and some great ideas from this book.

Here's what I think makes this book so valuable:
1. It walks you through the process of building your own investigative platform in both Windows and Linux, including which "behind the scenes" type of things you need for applications and processes to run smoothly.
2. It doesn't just focus on Windows analysis. It has multiple Operating Systems, File Systems, and ways to get at the data. If you want dedicated Windows analysis, look no further than Harlan's books (well, there are other good ones there, too, so don't take it literally - but you can't go wrong with his for sure).
3. It exposes you to some of the deeper concepts of these systems - inodes and journaling in EXT3, MFT and registry with NTFS, plists and user artifacts in OS X, and browser items of interest across the board.
4. It demonstrates the use of some specific tools - all open source, of course - in various platforms, and explains some of the pros and cons thereof.
5. [fanboy]It has a section on log2timeline. Enough said.[/fanboy] ;)

The authors have carefully limited the scope, not trying to stray too far afield, not digging too deep. I think they did a great job. If you're a newcomer to forensics, it will open your eyes and make you think. It will get you started in new directions and challenge your horizons. If you're a veteran forensicator - even if you know every single thing in this book - it makes an excellent refresher, stirring you up by way of reminder, so that you can remember in greater detail the things you forget because you do them every day, as well as the things you don't.

I think that about sums it up. It's a good read, and well worth it. If you're a fast reader and don't linger long on the examples I think you can wrap it up in a few short hours. If you take longer, stop to smell the roses and whatnot, it'll take a few longer hours, maybe even a couple days. I suggest you take the time, bookmark, highlight, etc to make sure you get the most out of it. Again, it's worth it.

Friday, June 17, 2011

Dropbox Writeup Posted on SANS Blog

The "short" Dropbox writeup I mentioned previously is now posted on the SANS Forensic Blog.

Before too long - hopefully - the full article should be up on Forensic Focus. At the end of that one I listed several things I thought were outstanding in regards to artifacts. I've been working on those, and before too long - hopefully - will be posting those here.

I'm also just wrapping up reading Cory Altheide's book and am going to post a "review" of that as well. Not really into writing reviews, but I think it's worth it.

Friday, June 10, 2011

#DFIRSummit - Afterthoughts, Part 3

Who would've thought it would take 3 posts to summarize the Summit in Austin? Not me. I did the first one because I needed (personally) to start the process; I knew then that the main body would require some dedicated space (it could probably have been broken up into 2 posts just for that piece of it). But there remains something very important to cover - the "thank you" section.

First and foremost, many thanks to SANS and their hard-working people for putting on the event. Obvious thanks go to Rob Lee as the host, but he wasn't the only one there. There were people handling registration, audio-visual, and presentation facilitation. Everyone did a great job, so thanks to the whole SANS team!

In addition, there were vendors who helped make things happen. AccessData, Netwitness, and Fortinet all had a presence there (Infogressive was in the program, but I don't recall seeing a booth; conversely, Fortinet was not listed, but was there nonetheless). Netwitness sponsored a lunch & learn, and AccessData sponsored an evening reception.

All of the panelists and presenters also deserve thanks for giving their time and effort to be there and participate; I know all the preparation takes a lot of time and mental energy. Some of them came not just from other states, but from all over the world (Iceland, Canada, Nebraska... ;) ).

And last but not least, all the attendees deserve thanks. They took time out of their lives, work, etc to be there. I'm sure it wasn't a burden for anyone, but some of them came a very long way (Spain, Canada, Germany, etc) to be there.

Without everyone listed above, there would be no event. Many thanks to all of you!

I think this is the last post on the subject...

LM

Thursday, June 9, 2011

#DFIRSummit - Afterthoughts, Part 2

Okay, so now we're on to the "real" content. First let me start off by addressing something I overlooked last night. Congratulations go to Eric Huber and his AFoD blog for winning the Forensic 4cast award for "Best Digital Forensic Blog." I know Eric did not anticipate winning, but he did, and deserves it! I must also say that I was sadly disappointed that log2timeline did not win the "Best Computer Forensic Software" category. I'm not the only one; there was a lot of discussion to that effect at the Summit. It seems that Guidance Software had an active internal campaign that paid off more than anything we did for Kristinn. General consensus from the Summit seems to be that l2t was the winner anyway. That's right!

I'm basically going to run through each presentation in order and give a couple tidbits. Any more than that and I'll be here all night! So without further ado...

Day 1

Andrew Hay - 5 Point Palm Exploding Heart Technique for Forensics
This was supposed to be Mike Cloppert's slot, but he was tied up (not literally).
The 5 Points:
Host/Platform forensics
Network forensics
Data Reduction
Corroboration
Orchestration
The overall idea is that you need to combine or integrate the various segments into one for more effective and comprehensive investigations, since host-based forensics can no longer really be the primary focus.

Chris Pogue - Sniper Forensics 2.0
DF is constantly changing. We have to be agile & adapt
DF is the most challenging forensics discipline because of the changes
The software tools you use in an investigation don't matter - your brain is your best tool.
You have to have a plan - this is *key* (and your steps should be consistent)
CLI is your friend. Yay, Chris! :)

Sean Morrissey - iOS Forensics
I have used Lantern and tend to prefer it over Mobilyze. However, I really would have liked more info about "iOS Forensics" (i.e., important artifacts and how to use them) rather than a presentation about Lantern.
Putting an iPhone in airplane mode does not disable WiFi. So if you are acquiring one, remove the SIM, put it in airplane mode, disable WiFi & Bluetooth, and use a Faraday bag if need be.
To recover/carve deleted entries from SQLite db, look for "de-referenced" items.

NetWitness Lunch&Learn (I think the presenter was Michael Sconzo, from their CIRT)
It was technical, not a sales pitch, and very much about results of network investigation for malware, as opposed to what NetWitness can do.
The main idea was to know what "good" or "benign" http sessions look like so you can quickly recognize anomalies. I think he actually mentioned something about reading RFC 2616; I don't remember anything after that point... Just kidding; it was very informative.

Hal Pomeranz - EXT3 File Recovery via Indirect Blocks
What can I say - you give Hal a command line, a hex editor, a Linux file system, and he just starts dancing!
File-carving assumes 100% contiguous data...
Indirect block pointers are not nulled out when a file is deleted (unlike direct pointers).
When decoded, they will point to the next block #.
Hal has some tools to automate the recovery process, rather than manually following the indirect pointers; they basically run on top of TSK and call its utilities as needed:
frib (file recovery indirect blocks) - this works if you know where the file started, and can progress forward from there.
fib (find indirect block) - finds indirect block (by signature, within the block grouping you're targeting), then counts back 12 blocks to what should be the start of the file.
He has a whitepaper and the tools on Mandiant's blog

RMOs were handed out by Rob Lee, to:
David Kovar - for AnalyzeMFT
Bamm Visscher - for sguil
Congratulations, guys!

Terry Maguire - IR Process & Smart Phones
As these phones become more common in the enterprise, we have to know how to handle them.
**Note: both android and iOS use a lot of SQLite db files.
-sqlite browser (sourceforge) is good, but no deleted entries will show
-epilog by CCL Forensics is designed to show deleted entries (not free, commercial product)
Android must be rooted to get access to any real information. This requires modifying the phone, though if you use something like z4root, that can be undone with the click of a button.
In order to get volatile data from iPhone, it will have to be jailbroken.
Blackberry cannot be imaged like other devices; removing & imaging chips might be possible. Blackberry file system can be mounted either through desktop manager or javaloader, but be careful; it's easy to destroy data! Blackberry Messenger SMS are not contained in IPD files; they can only be collected from mounted file system.
ABC Amber Blackberry Converter is now Blackberry Backup Explorer by Elcomsoft.

Mike Cloppert - Distinguishing IR from Computer Network Defense
He's in Andrew Hay's original slot.
APT & such are much more advanced than the traditional IR models developed a decade ago:
Highly aware (situational awareness)
Adaptive
Lots of tools
There may be multiple adversaries/attack vectors simultaneously or near-simultaneously.
Campaigns (by adversaries) may span several years.
The conventional IR model is based on the presumption of a successful compromise. If it's an "imminent threat" the model doesn't fit. The model is reactive, not proactive. Needs to be more proactive.
Have a monthly overview of reporting to help determine where to focus preventive efforts.

Day 2

Kristinn Gudjonsson - log2timeline
version 0.60 - the "killer dwarf" release - now works on Windows; instructions on how to install in docs/install (Chris Pogue created/tested documentation).
Rewritten engine, work is done on back-end.
It is more object-oriented, and has preprocessing modules.
With the front-end not doing processing, you can easily build your own, for integration into your own processes, customize default action, etc.
It now has a Skype parser. It includes code from regripper and regtime to automatically pull in all the registry data. And (drumroll, please) David Kovar's AnalyzeMFT has been imported as well, to parse the MFT. Of course, that means it had to go from python to perl, but we won't get into that.

Mike Pilkington - Protecting Privileged Domain Accounts during Live Response!
Mission: remote access to WinXP (SP2) workstation (no patches) for analysis/triage
wmic
psexec
net use
You don't want attackers who may be present to capture privileged credentials.
Do not use any type of interactive logon as this will cause a password hash to be stored locally. Running psexec creates a vulnerability for delegate-level access token theft. Don't set IR accounts as admin accounts; put them into different groupings and give those elevated privileges only as needed.

Panel: Professional Development in Digital Forensics and Incident Response
Lenny Zeltser, Richard Bejtlich, Ken Dunham, Joe Garcia, Bamm Visscher
Everyone had prepared questions they spoke to, then it was opened to questions from the audience. I will touch on one, for Richard: How do I build a computer incident response team? I thought the absolute key was his statement that you have to keep the group tightly-knit and give the analysts what they need to do their jobs - training, equipment, etc. The best part was that he said you have to protect them fiercely. That's leadership! He had a blog post about this recently; it's obviously important to him.

Lee Whitfield - Digital Forensics and Flux Capacitors
Looking at reasons/ways people try to get out of trouble with their computers
Focus: Time/system clock alteration (as an excuse)
Top places to check at start of investigation
system event logs (except on XP, where it's not as important)
$UsnJrnl:$J
LNK files
Restore Points
Who is @gingerlover_17 Lee? ;)

Hal Pomeranz - EXT4: Bit by Bit
Changes in EXT4
48-bit address space
Uses extents instead of indirect block chains
64-bit nanosecond-resolution timestamps
File creation timestamp (born, or b-time)
Backwards compatibility design goal
Inodes expanded to 256 bytes (from 128)
Most of the offsets listed in Carrier's book still apply to EXT4
Hal dove right in with his hex editor, heads exploded, Hal danced, twitter was on fire, etc. It was a very good presentation!

Panel: Forensics in the New Cloud Frontier
Andrew Hay, Cory Altheide, Joe Garcia, Robert Lee, Ed Skoudis
The questions were sprung on the panelists w/o preparation. Wow.
Here's my take: The cloud is here. It's not leaving. You need to know what kind of alerts your cloud provides (to indicate compromise/issue, like gmail's alerts to different locations accessing your account). Distributed processing is going to be key to future analysis (think multi-GB log files). Make sure your cloud provides you with auditing capabilities, as logs are going to be the target of your analysis. Look at the kind of data you've needed from recent incidents, and see if you can get that from your cloud.
Then it was opened up to the audience's questions, including:


#dfirsummit Q for panel: Would you get a 4Cast award for staying within a reasonable budget while proactively responding using sniper forensics, five point palm methodology and log2timeline to analyze a mobile device running ext4 whose clock was reset using false domain credentials through the cloud?

Does that question not totally sum it up?

Oh, there was one more panel, the vendor panel. I had to leave right before that, so that's where my summary falls short. However, I think the last question for the previous panel is the best place to end...

LM