Tuesday, October 27, 2009

feed finder in php

Here is a PHP snippet which finds the RSS and Atom links of a web site by parsing the link elements in the head section of the page.




Friday, October 23, 2009

feed finder in python

Here is a Python code snippet which finds the RSS and Atom feed links of any web site...


import sys
from urllib2 import urlopen
from urlparse import urljoin
from HTMLParser import HTMLParser, HTMLParseError

class FeedAutodiscoveryParser(HTMLParser):
    # These are the MIME types of links accepted as feeds
    FEED_TYPES = ('application/rss+xml',
                  'text/xml',
                  'application/atom+xml',
                  'application/x.atom+xml',
                  'application/x-atom+xml')

    def __init__(self, base_href):
        HTMLParser.__init__(self)
        self.base_href = base_href
        self.feeds = []

    def handle_starttag(self, tag, attrs_tup):
        tag = tag.lower()
        attrs = dict([(k.lower(), v) for k, v in attrs_tup])
        # A <base href="..."> element changes the URL that relative links resolve against
        if tag == "base" and 'href' in attrs:
            self.base_href = attrs['href']
        if tag == "link":
            rel = attrs.get("rel", "")
            type = attrs.get("type", "")
            title = attrs.get("title", "")
            href = attrs.get("href", "")
            if rel == "alternate" and type in self.FEED_TYPES:
                self.feeds.append({
                    'type' : type,
                    'title' : title,
                    'href' : href
                })

def getFeedsDetail(url):
    data = urlopen(url).read()
    parser = FeedAutodiscoveryParser(url)
    try:
        parser.feed(data)
    except HTMLParseError:
        # Keep whatever feeds were found before the malformed markup
        pass
    for feed in parser.feeds:
        feed['href'] = urljoin(parser.base_href, feed['href'])
    return parser.feeds

def getFeeds(url):
    return [ x['href'] for x in getFeedsDetail(url) ]


def main():
    url = sys.argv[1]
    feeds = getFeedsDetail(url)
    print
    print "Site %s : " % url
    print "###########################################"
    print
    for feed in feeds:
        print "Title : '%(title)s' \nType : %(type)s \nURI : %(href)s" % feed
        print "------------------------------------------------------------------------"
    print

if __name__ == "__main__":
    main()


Here is how to use it...


F:\Python26>python minifeedfinder.py http://www.timesofindia.com/

Site http://www.timesofindia.com/ :
###########################################

Title : ''
Type : application/rss+xml
URI : http://www.timesofindia.com/rssfeedsdefault.cms
------------------------------------------------------------------------


F:\Python26>python minifeedfinder.py http://asitdhal.blogspot.com/

Site http://asitdhal.blogspot.com/ :
###########################################

Title : 'Life like this - Atom'
Type : application/atom+xml
URI : http://asitdhal.blogspot.com/feeds/posts/default
------------------------------------------------------------------------
Title : 'Life like this - RSS'
Type : application/rss+xml
URI : http://asitdhal.blogspot.com/feeds/posts/default?alt=rss
------------------------------------------------------------------------


The equivalent PHP code is at the following link...
http://kodeyard.blogspot.com/2009/10/feed-finder-in-php.html
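
The script above is written for Python 2.6; the urllib2, urlparse and HTMLParser modules were renamed in Python 3. Here is a minimal sketch of the same autodiscovery idea for Python 3. The names FeedLinkParser, find_feeds and SAMPLE are my own for illustration, the sample page is made up, and the sketch parses a string so that it runs without a network connection. It also accepts a rel attribute with several tokens, which is an assumption beyond what the original script does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch (a subset of the list above)
FEED_TYPES = ("application/rss+xml", "application/atom+xml")


class FeedLinkParser(HTMLParser):
    """Collects the href of every <link rel="alternate"> with a feed MIME type."""

    def __init__(self, base_href):
        super().__init__()
        self.base_href = base_href
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # html.parser already lowercases tag and attribute names
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "base" and attrs.get("href"):
            self.base_href = attrs["href"]
        elif tag == "link":
            rels = attrs.get("rel", "").lower().split()
            if "alternate" in rels and attrs.get("type", "").lower() in FEED_TYPES:
                self.feeds.append(attrs.get("href", ""))


def find_feeds(html, page_url):
    """Return absolute feed URLs found in the given HTML text."""
    parser = FeedLinkParser(page_url)
    parser.feed(html)
    return [urljoin(parser.base_href, href) for href in parser.feeds]


# Hypothetical page used only to exercise the parser
SAMPLE = """<html><head>
<link rel="alternate" type="application/atom+xml" title="Atom" href="/feeds/posts/default">
<link rel="alternate" type="application/rss+xml" title="RSS" href="/feeds/posts/default?alt=rss">
<link rel="stylesheet" type="text/css" href="/style.css">
</head><body></body></html>"""

if __name__ == "__main__":
    for url in find_feeds(SAMPLE, "http://example.com/"):
        print(url)
```

To use it on a live site, fetch the page with urllib.request.urlopen and pass the decoded text to find_feeds.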

Saturday, October 17, 2009

Country and City from ip address (php)

Here is some code I wrote from scratch to get geographical information from an IP address.



This code does not validate the address. Since it depends on a whois server to perform the lookup, the result takes some time to appear.
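
Since the code above skips validation, here is a minimal sketch of how the address could be checked before any whois lookup is attempted. It is written in Python with the standard ipaddress module, not in PHP, and the function name is_public_ip is my own. Rejecting private and loopback addresses is an assumption on my part: a whois server has nothing useful to say about them.

```python
import ipaddress


def is_public_ip(text):
    """Return True if text is a valid, globally routable IPv4 or IPv6 address."""
    try:
        addr = ipaddress.ip_address(text.strip())
    except ValueError:
        # Not parseable as an IP address at all
        return False
    # is_global is False for private, loopback, link-local and reserved ranges
    return addr.is_global


if __name__ == "__main__":
    for candidate in ("8.8.8.8", "192.168.1.1", "999.1.1.1", "hello"):
        print(candidate, is_public_ip(candidate))
```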

Wednesday, October 7, 2009

link extractor in python

During my engineering studies, I wrote a Python script that extracts the links from a web page.
Here is the code...


import urllib
import sys
import os.path
import sgmllib


print "\n\n\t\tlipun4u[at]gmail[dot]com"
print "\t\t------------------------"

appname = os.path.basename(sys.argv[0])

class MyParser(sgmllib.SGMLParser):
    "A simple parser class."

    def parse(self, s):
        "Parse the given string 's'."
        self.feed(s)
        self.close()

    def __init__(self, verbose=0):
        "Initialise an object, passing 'verbose' to the superclass."

        sgmllib.SGMLParser.__init__(self, verbose)
        self.hyperlinks = []

    def start_a(self, attributes):
        "Process a hyperlink and its 'attributes'."

        for name, value in attributes:
            if name == "href":
                self.hyperlinks.append(value)

    def get_hyperlinks(self):
        "Return the list of hyperlinks."

        return self.hyperlinks



# Show the usage message when the argument is missing or is a help flag
if len(sys.argv) != 2 or sys.argv[1] in ("-h", "--help"):
    print "Usage : " + appname + " "
    print "e.g. : " + appname + " www.google.com "
    sys.exit(1)



# Normalise the target so that it always starts with http://
site = sys.argv[1].replace("http://","")
site = "http://" + site.lower()

print "Target : " + site
try:
    site_data = urllib.urlopen(site)
    parser = MyParser()
    parser.parse(site_data.read())
except(IOError),msg:
    print "Error in connecting site ", site
    print msg
    sys.exit(1)
links = parser.get_hyperlinks()
print "Total no. of hyperlinks : " + str(len(links))
print ""
for l in links:
    print l


Here is the usage message, followed by a sample run...

I:\Python26>linkscan1.py


lipun4u[at]gmail[dot]com
------------------------
Usage : linkscan1.py
e.g. : linkscan1.py www.google.com

I:\Python26>linkscan1.py www.iter.ac.in


lipun4u[at]gmail[dot]com
------------------------
Target : http://www.iter.ac.in
Total no. of hyperlinks : 12

http://iter.ac.in
default.asp
contactus.asp
http://iter.ac.in:8383
time-table.xls
http://www.soauniversity.ac.in/saat_2009.htm
images/advertisement_Saat2009.gif
#
#
#
#
http://www.allindiaonline.in/

I:\Python26>
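
The script above needs Python 2, because the sgmllib module was removed in Python 3. Here is a minimal sketch of the same href collection using html.parser from the Python 3 standard library. The names LinkParser and extract_links are my own, and the sketch works on a string, so fetching the page is left out.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hyperlinks = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hyperlinks.append(value)


def extract_links(html):
    """Return the list of hyperlinks found in the given HTML text."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return parser.hyperlinks


if __name__ == "__main__":
    page = '<a href="default.asp">Home</a> <a name="top">Top</a> <A HREF="#">Up</A>'
    for link in extract_links(page):
        print(link)
```

As in the original, relative links such as default.asp are returned untouched; urllib.parse.urljoin could be used to make them absolute.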


But some guys added some spice to it and look what they made...


key logger in C++

Now it is time to do some bad work. Here is the code of a key logger that works on the Windows NT platform. Unlike the previous code, this one is not mine. The original link is http://www.rohitab.com/discuss/index.php?showtopic=19360.

Before going on, read this link to learn what a key logger is:
http://www.amolenuvolette.it/root/kulture/keylogger.txt


Here goes the code...



Compile it in Microsoft Visual Studio 6.0. I don't know whether it compiles with any other compiler. When you run it, it logs every key pressed to the file keys.log in the same folder as the executable. You will see the console window, because no code has been added to hide it.

Now add some code in the main function to make it invisible...




Now the key logger is ready to run. If you want to know how to start it during Windows start-up, see http://packetstormsecurity.org/Win/auto.txt

N.B. This is for educational purposes only. If anyone gets busted for using this code, I won't be responsible.

Sunday, October 4, 2009

timeout in Session (PHP)



  • Sessions allow a PHP script to store data on the web server that can be used later, even across requests to different PHP pages.


  • When a session is created, a flat file is created on the server. Since every session ID is a unique identifier, these session files accumulate over time.


  • The PHP garbage collector deletes old session files from time to time. But the garbage collector is invoked only with a certain probability, not on every request.


  • The default lifetime of a session file is 1440 seconds (24 minutes). A session file can be deleted once that time has passed, but it may stay on the server longer, depending on how many sessions are created; this is where the probability comes into play.


  • The session itself may be meant to last until the browser is closed, but the garbage collector might delete the session file much earlier. If a request for that session arrives after its file has been deleted, a new session is created and the old session information is lost. This is annoying.


  • Three settings in the php.ini file control the garbage collector:





    Variable                  Default value   Changeable
    session.gc_maxlifetime    1440 seconds    PHP_INI_ALL
    session.gc_probability    1               PHP_INI_ALL
    session.gc_divisor        100             PHP_INI_ALL

    session.gc_probability and session.gc_divisor together control the probability that the gc (garbage collection) routine is invoked when a session is started. The probability is gc_probability/gc_divisor, so the default values give 1/100, i.e. a 1% chance each time.


  • The garbage collection timeout can be changed at run time. Do this before calling session_start().

    $timeout = 7200; // 7200 seconds = 2 hours
    ini_set('session.gc_maxlifetime', $timeout);




  • A shorter session timeout can also be enforced in the script itself, without changing the global setting, by storing the time of the last request in the session.

    session_start();
    // set timeout period in seconds
    $inactive = 600;
    if (isset($_SESSION['timeout'])) {
        $session_life = time() - $_SESSION['timeout'];
        if ($session_life > $inactive) {
            session_destroy();
            header("Location: logoutpage.php");
            exit; // stop here so the rest of the page is not run
        }
    }
    $_SESSION['timeout'] = time();
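
To get a feel for the gc_probability/gc_divisor arithmetic described in the list above, here is a small Python sketch (not PHP). The function names are my own, and it assumes every session start triggers the garbage collector independently with the configured probability.

```python
def gc_run_probability(gc_probability=1, gc_divisor=100):
    """Chance that garbage collection runs on one session start."""
    return gc_probability / gc_divisor


def chance_gc_has_run(session_starts, gc_probability=1, gc_divisor=100):
    """Chance that garbage collection ran at least once in the given
    number of session starts, assuming independent trials."""
    p = gc_run_probability(gc_probability, gc_divisor)
    return 1 - (1 - p) ** session_starts


if __name__ == "__main__":
    print(gc_run_probability())       # default settings: 1/100
    for n in (10, 100, 500):
        print(n, round(chance_gc_has_run(n), 3))
```

With the default settings, the collector has only about a 63% chance of having run after 100 session starts, which is why an expired session file on a quiet site can survive well beyond 24 minutes.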




Friday, October 2, 2009

login page in PHP(naive)

This example shows beginners how to build the login page of a web site in PHP.

First, create the table in the database (I am using MySQL).


CREATE TABLE login (
user_id INT AUTO_INCREMENT NOT NULL,
user_name VARCHAR(25) UNIQUE NOT NULL,
password VARCHAR(16) NOT NULL,
PRIMARY KEY(user_id)
);


Now, let's insert some data.

INSERT INTO login
(user_name, password)
VALUES
('asit', 'lipu');

INSERT INTO login
(user_name, password)
VALUES
('google', 'yahoo');


This is the HTML page that displays the login form.

login.html




Now let's write the PHP code that makes the necessary database connection.

database.php



<?php
$hostname = "localhost";
$username = "root";
$password = "iitiit";
$database = "db2";
// connect with $username (the original passed an undefined $user)
$link = mysql_connect($hostname, $username, $password) or die("MySQL can't be connected");
mysql_select_db($database, $link) or die("Database can't be connected");
?>


As the user enters the username and password, the requested information is sent to the server and the login.php script will be invoked.
Here is the code

login.php




If the user successfully logs in, then the page is redirected to welcome.php, otherwise to error.html

welcome.php




error.html




A few little-known facts

  • No real web site should use this technique, because the login information is not encrypted: the passwords sit in the table as plain text.
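
To show what storing the login information in a protected form could look like, here is a minimal Python sketch (not PHP) that stores a salted hash in place of the password itself. The function names, the choice of PBKDF2 with SHA-256 and the iteration count are all my assumptions, not part of the login example above.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is made when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


if __name__ == "__main__":
    salt, stored = hash_password("lipu")
    print(verify_password("lipu", salt, stored))
    print(verify_password("yahoo", salt, stored))
```

The table would then keep the salt and the digest and never the password. Note that a 32-byte digest written as hex takes 64 characters, so the VARCHAR(16) column above would be too small.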

Thursday, October 1, 2009

Euclidean Algorithm(GCD)

The greatest common divisor (GCD) of two numbers can be calculated easily using the Euclidean algorithm.

This is the non-recursive pseudocode







function gcd(a, b)
    while b ≠ 0
        t := b
        b := a mod b
        a := t
    return a



/* C version; note that it uses repeated subtraction in place of mod */
int gcd(int a, int b)
{
    if (a == 0)
        return b;
    while (b != 0)
        if (a > b)
            a = a - b;
        else
            b = b - a;
    return a;
}


This is the recursive pseudocode







function gcd(a, b)
    if b = 0
        return a
    else
        return gcd(b, a mod b)



int gcd(int a, int b)
{
    if (b == 0)
        return a;
    else
        return gcd(b, a % b);
}
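
Here is a small Python sketch of both forms of the pseudocode above, for anyone who wants to try them quickly. The function names are my own. For example, gcd(48, 18) goes (48, 18) → (18, 12) → (12, 6) → (6, 0), so the answer is 6.

```python
import math


def gcd_iterative(a, b):
    """Non-recursive Euclidean algorithm using the mod operation."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_recursive(a, b):
    """Recursive Euclidean algorithm: gcd(a, b) = gcd(b, a mod b)."""
    return a if b == 0 else gcd_recursive(b, a % b)


if __name__ == "__main__":
    print(gcd_iterative(48, 18))
    print(gcd_recursive(48, 18))
    # Cross-check against the standard library for non-negative inputs
    print(all(gcd_iterative(x, y) == math.gcd(x, y)
              for x in range(50) for y in range(50)))
```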

get IP address in C(windows)

I wrote this code snippet in my 2nd year. It finds the IP address of a Windows machine. The code is simple. Just go through it...

explicit keyword in C++

Look at the following code.




Here ABC a=100 is equivalent to ABC a(100);
The compiler quietly uses the single-argument constructor to turn 100 into an ABC object. This is known as an implicit conversion. It reduces the readability of the code, and it can be avoided by using the keyword explicit.

By prefixing the constructor with the explicit keyword, we can prevent the compiler from using that constructor for implicit conversions.

Look at the following code.


In this case ABC a=100 will be an error. The object can only be created with the constructor notation, ABC a(100).

Some more information about explicit...

  • The explicit keyword is used to declare a single-argument constructor that can only be called explicitly. If the constructor requires more than one argument, an implicit conversion from a single value cannot happen anyway, so the keyword is of no use there.

  • It is only used in declarations of constructors within a class declaration.