As security researchers and pentesters know, information gathering is often overlooked and rarely given the attention it deserves. Nevertheless, it remains a vital phase of the pentesting process.
This blog post covers a range of tools for basic recon in a pentest engagement, since no one relies on a single tool. More advanced recon techniques will be covered in part 2 of this blog.

The Toolset, by Category

Subject 1. Subdomain Enumeration.

In most cases, say in a bug bounty program, many vulnerabilities may not lie in the main domain, so subdomain hunting comes in handy. Let's look at some ways to enumerate and discover subdomains.

a. Knockpy

Knockpy is a handy tool for this purpose. It uses a wordlist that can be customized to fit your target.

b. Sublist3r

Mentioned by pentesters and bug bounty hunters all over the internet, this one is a must-try.
Sublist3r relies purely on OSINT techniques. It crawls different search engines, including Google, Baidu, Yahoo and Ask. Subdomain enumeration is also possible via DNSdumpster, Netcraft and VirusTotal, among others.

c. Google dorks

Google, as the most popular search engine, indexes and caches all sorts of websites. This makes it a good tool for finding subdomains it has crawled. We just need to know how to ask. Using 'google.com' as an example, we can easily do this and expose the subdomains.

Some good scripts also exist that automate Google dorking. Here are two:
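
The exact query from the original example is not reproduced here, so the following is only a sketch of a commonly used form: the site: operator with a wildcard, plus -site: exclusions for hosts we already know. The function name subdomain_dork is made up for illustration.

```shell
# Build a Google dork that lists indexed subdomains of DOMAIN while
# excluding hosts that are already known.
# Usage: subdomain_dork DOMAIN [KNOWN_HOST...]
subdomain_dork() (
  domain=$1
  shift
  query="site:*.${domain}"
  for host in "$@"; do
    query="${query} -site:${host}"   # hide results from hosts we already have
  done
  printf '%s\n' "$query"
)

subdomain_dork google.com www.google.com
# prints: site:*.google.com -site:www.google.com
```

Paste the printed string into the search box, then keep adding each newly found host as another -site: exclusion until nothing new shows up.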

GoogD0rker

This one automatically launches a series of queries against the specified target, which makes it a great OSINT tool. It can find documents, login pages, backdoors, files by extension, Pastebin posts, subdomains etc.
Download it here.

GooHak

Similar to the above. Find it here.

d. Amass

An OWASP tool for subdomain discovery that draws on multiple sources. More info can be found on the project's git page.

e. Curl one liner

This is a cool one-liner I found on Twitter in a tweet from Ben Sadeghipour. It's pretty simple and uses archive.org to scrape the subdomains. Sorting at the end with sort -u removes duplicate hosts even when they are not adjacent in the archive listing.

==> curl -s "https://web.archive.org/cdx/search/cdx?url=*.testfire.net/*&output=text&fl=original&collapse=urlkey" | sed -e 's_https*://__' -e 's/\/.*//' -e 's/:.*//' -e 's/^www\.//' | sort -u <==

f. Confirm live domains

During my hunts, I found that a number of the domains returned by mass-scraping tools do not resolve. So I wrote a simple bash script that takes a text file of discovered subdomains, tries to resolve each one and reports which ones don't. Download it here.
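
The script itself sits behind the download link, so what follows is not the author's code, just a minimal sketch of the same idea. It assumes getent is available (glibc-based Linux); on other systems the body of resolves could be swapped for host or dig. Note that this version prints the names that do resolve. The names resolves and filter_live, and the file names in the usage note, are placeholders.

```shell
# resolves HOST: succeed if the system resolver returns an address for HOST.
# Assumption: getent exists on this machine.
resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

# filter_live: read one hostname per line on stdin and print only the
# names that resolve.
filter_live() {
  while IFS= read -r name; do
    [ -n "$name" ] || continue   # skip blank lines
    if resolves "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

A typical run would be: filter_live < subdomains.txt > live.txt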

Subject 2. Web Server Fingerprinting.

culprit: HTTP Methods

a. Curl

Curl is a pretty powerful CLI tool. Besides being used by pentesters to exploit file inclusions (RFI, LFI), command injections, HTTP file uploads etc, it can also be used to identify the HTTP methods allowed on a server. Some servers have OPTIONS disabled, however; in that case we can use HEAD instead to at least retrieve the response headers.

Dangerous methods like TRACE and PUT should not be allowed. To exploit PUT, check out Nmap scripts, tools like Burp and browser add-ons like Poster.
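
As a rough sketch of the curl approach, the helpers below read raw response headers and pull the methods out of the Allow header, then keep only TRACE and PUT. The function names and target.example are placeholders, and the curl calls are left as comments so nothing is sent until you point them at a host you are allowed to test.

```shell
# allowed_methods: read raw HTTP response headers on stdin and print each
# method named in the Allow header on its own line.
allowed_methods() {
  tr -d '\r' | awk '
    tolower($0) ~ /^allow:/ {
      sub(/^[^:]*:/, "")                          # drop the header name
      n = split($0, methods, ",")
      for (i = 1; i <= n; i++) {
        gsub(/^[ \t]+|[ \t]+$/, "", methods[i])   # trim whitespace
        if (methods[i] != "") print methods[i]
      }
    }'
}

# risky_methods: keep only the methods called out above as dangerous.
risky_methods() {
  grep -E '^(TRACE|PUT)$'
}

# Against a live server (target.example is a placeholder):
#   curl -s -i -X OPTIONS https://target.example/ | allowed_methods | risky_methods
# If OPTIONS is disabled, fetch the headers alone with HEAD:
#   curl -s -I https://target.example/
```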

b. NMAP

Nmap's scripting engine (NSE) can come in handy here. The http-methods script, for example, reports which HTTP methods a server supports.

c. My rudimentary curl script

I wrote this simple script to print out the response headers for a list of servers in a text file. If the OPTIONS method is enabled on a server, the output also includes the list of methods it allows. See it here. The script includes a simple http(s) check using wget for the list of servers. As usual, it can be improved or modified.

N/B. netcat and nikto can also be used for this.

culprit: Application Mapping

To attack an application, we have to understand how it works, its architecture and its underlying technologies.

Identifying technology used

Wappalyzer

This is a browser extension that identifies an application's underlying technologies. These may include the language used, development frameworks, CMS, analytics frameworks etc. It runs on both Firefox and Google Chrome.

Whatruns

Another browser plugin that works the same way. Just a bit more aggressive.

WafW00f (Firewall discovery)

So we want to actively interact with the target. However, different probes might get blocked by a security solution such as a WAF. If so, we can identify the WAF in use with Sandro Gauci's tool, WafW00f, which can be found here.

Content discovery

dirb

A very comprehensive directory/file bruteforce tool that uses a custom wordlist to find the directories or files that exist. This happens to be my favorite.

dirsearch

Similar to dirb, but with colored output that makes status codes easier to spot. It searches for both files and directories, and lets you specify extensions, e.g. php, txt, rar, zip etc.

DirBuster

Another one by OWASP. With a cool-looking GUI, it does file and content discovery with an option to specify a custom wordlist. It also ships with a cool set of wordlists. It can be found here.

nikto

From banner grabbing and header analysis to light default directory/file discovery, nikto is pretty handy. It also offers suggestions and advisory info on why the discovered issues are dangerous.

Aquatone

It gives a visual representation of the websites listed in a text file, which helps map out the best attack surface. For example, it makes it easy to find login pages without manually visiting each page. It takes screenshots of the pages and saves them to a folder, along with the response headers.
It also has other modules, i.e. scan, discovery, gather and takeover, that will be discussed in part 2 of this blog.

burp intruder

Burp’s intruder also serves as a multipurpose tool. In this context it can be used to bruteforce files, directories, GET params etc while observing status codes as well as content length.
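
To make the idea behind these content discovery tools concrete, here is a bare-bones sketch (not how dirb, dirsearch or Intruder are actually implemented): request each word from a list and report the status code and body size of anything that is not a 404. The function names, target.example and words.txt are placeholders.

```shell
# probe URL: print "STATUS SIZE" for a single GET request.
# -o /dev/null discards the body; -w prints curl's own counters.
probe() {
  curl -s -o /dev/null -w '%{http_code} %{size_download}' "$1"
}

# discover BASE_URL: read candidate paths on stdin and report every path
# that does not answer 404, with its status code and body size.
discover() {
  base=${1%/}                          # tolerate a trailing slash
  while IFS= read -r word; do
    [ -n "$word" ] || continue         # skip blank lines
    result=$(probe "${base}/${word}")
    case $result in
      404\ *) ;;                       # not found, stay quiet
      *) printf '%s %s\n' "$result" "/${word}" ;;
    esac
  done
}
```

A run would look like: discover https://target.example/ < words.txt. Servers that answer every path with 200 need a smarter filter, which is where comparing content lengths earns its keep.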

The perfect wordlist for the job

None of these tools packs much of a punch without a good set of wordlists. From my research I discovered SecLists, which is probably as comprehensive as it gets. With its collections of usernames, passwords, URLs, payloads etc, it earns the 'ultimate wordlist' title.

N/B: Discovering ‘hidden’ GET/POST parameters.

During pentesting or bug bounty hunting, the best way to attack a page is through its inputs, so parameters are really important. If we can't find them at first glance, it's possible to hunt for 'hidden' parameters using different tools.

Arjun

One tool by UltimateHackers comes to mind. Arjun is a script that helps bruteforce these parameters using a word list that can be customized.

Parameth

This one worked for me a while back and does the same job, but I still prefer Arjun.

culprit: Other search engines

While Google might be the most popular search engine, it's not the only one. And if you're gonna be finding vulnerabilities, then you most likely need these two as well…

shodan (shodan.io)

Hands down the ultimate IoT search engine. Like Google, it supports 'dorks' (search filters) that make searches smarter and narrow down what it finds. It's right here.

Some common dorks may include:

country: find devices in a certain country

hostname: find devices matching the given hostname

port: find devices on given open ports

os: return results that match the given OS

before/after: find results within a given time frame

city: find devices in a certain city
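
Filters are written as name:value and combined in a single search, separated by spaces. The helper below is only an illustrative sketch for gluing them together; shodan_query is a made-up name, and the assumption that country takes a two-letter code should be checked against Shodan's filter reference.

```shell
# shodan_query FILTER=VALUE...: join filters into one Shodan search string.
# Values that contain spaces are wrapped in double quotes.
shodan_query() (
  query=""
  for pair in "$@"; do
    key=${pair%%=*}                    # text before the first "="
    value=${pair#*=}                   # text after the first "="
    case $value in
      *\ *) value="\"${value}\"" ;;
    esac
    query="${query:+$query }${key}:${value}"
  done
  printf '%s\n' "$query"
)

shodan_query country=KE port=22 city=Nairobi
# prints: country:KE port:22 city:Nairobi
```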

Censys (censys.io)

Kinda like Shodan, in that it can also search for devices accessible from the internet.
Let's find Debian servers running SSH in Europe using the query

22.ssh.v2.metadata.product:"OpenSSH" AND metadata.os:"Debian" AND location.continent:"Europe"

Have fun with the 100 ways of discovery. Part 2 coming soon.

100 ways to discover (part 1)