Monday, October 12, 2009

Exporting bugs from GForge.

At times you need to export data from a list of bugs for reporting purposes. One way of doing that, though not a recommended one, is to export the data from the GForge data tables using SQL.
A quick way is to perform screen scraping. I am going to give an example of this approach.

Here are the steps.

1. Log in to your GForge account.
2. Get the details for a bug.
3. Fix the HTML to make it XHTML.
4. Use an XSLT transform to convert the XHTML into your desired format.


1. Log in to your GForge account.
You need to use wget to log in. The wget invocation should look like this:
$ wget --keep-session-cookies --no-check-certificate --save-cookies=saved.cookie --post-data="return_to=%2fmy%2f&form_loginname=YourLoginName&form_pw=YourPassword2&login=Login" "https://www.yourgforgeserver.com/account/login.php?return_to=%2Fmy%2F"
If you also need to supply an HTTP authentication username and password, add these options at the beginning of the wget command:
--auth-no-challenge --http-user=YourDigestUserName --http-password=YourDigestPassword
After this command runs, the cookies are saved in the file saved.cookie.
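
Before moving on, it can help to confirm that the login step actually stored something. The sketch below assumes that wget writes saved.cookie in the Netscape cookie-file layout ('#' comment lines plus one tab-separated line per cookie). The helper name has_cookies is made up for this illustration. A stored cookie does not prove the login was accepted, so treat this only as a quick sanity check.

```shell
#!/bin/sh
# has_cookies FILE: succeed if FILE holds at least one cookie entry.
# Assumption: Netscape cookie-file layout, i.e. '#' comment lines,
# blank lines, and one tab-separated line per cookie.
has_cookies() {
    # -v with two patterns selects lines that are neither comments nor blank;
    # -q makes grep succeed as soon as one such line is found.
    grep -qv -e '^#' -e '^[[:space:]]*$' "$1" 2>/dev/null
}

# Usage after the login step:
#   has_cookies saved.cookie || echo "no cookies saved, login may have failed" >&2
```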

2. Get the details for a bug.
$ export BUG_ID=7777
$ wget --keep-session-cookies --no-check-certificate -O "$BUG_ID.details.html" --load-cookies=saved.cookie "https://www.yourgforgeserver.com/tracker/?func=detail&aid=$BUG_ID&group_id=15&atid=165"
Note: The group_id and atid will be different for your project. Please specify correct values to get the details page.
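
Since the goal is a report over a list of bugs, the same command can be wrapped in a loop. This is only a sketch: the function names bug_url and fetch_bugs are made up here, and BASE_URL, GROUP_ID and ATID simply repeat the placeholder values from the example above, so replace them with the values for your project.

```shell
#!/bin/sh
# Placeholder values taken from the example above; replace with your own.
BASE_URL="https://www.yourgforgeserver.com"
GROUP_ID=15
ATID=165

# bug_url ID: print the tracker detail URL for one bug.
bug_url() {
    printf '%s/tracker/?func=detail&aid=%s&group_id=%s&atid=%s\n' \
        "$BASE_URL" "$1" "$GROUP_ID" "$ATID"
}

# fetch_bugs ID...: download each bug's detail page to ID.details.html,
# reusing the cookies saved by the login step.
fetch_bugs() {
    for id in "$@"; do
        wget --keep-session-cookies --no-check-certificate \
            --load-cookies=saved.cookie \
            -O "$id.details.html" "$(bug_url "$id")"
    done
}

# Usage: fetch_bugs 7777 7778 7779
```

Keeping the URL construction in its own function makes it easy to print and check the URL before any download is attempted.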

3. Fix the HTML to make it XHTML.

The HTML generated by GForge is not well-formed, so you need to clean it up. Use tagsoup to achieve this. Download the jar file; I used tagsoup version 1.2 from this location: http://home.ccil.org/~cowan/XML/tagsoup/tagsoup-1.2.jar
$ wget http://home.ccil.org/~cowan/XML/tagsoup/tagsoup-1.2.jar
--2009-10-10 23:19:40-- http://home.ccil.org/~cowan/XML/tagsoup/tagsoup-1.2.jar
Resolving home.ccil.org... 192.190.237.100
Connecting to home.ccil.org|192.190.237.100|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 90023 (88K) [application/java-archive]
Saving to: `tagsoup-1.2.jar'


100%[===============================>] 90,023 248K/s in 0.4s


2009-10-10 23:19:41 (248 KB/s) - `tagsoup-1.2.jar' saved [90023/90023]

Now invoke this program to fix the HTML file.
$ java -jar tagsoup-1.2.jar --nons --files 7777.details.html
src: 7777.details.html dst: 7777.details.xhtml

4. Use an XSLT transform to convert the XHTML into your desired format.
Create an XSL file to pick data from this XHTML file and create the output. A sample XSL file (named starter.xsl) is shown below:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<html><head><title><xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/input[3]/@value"/></title></head>
<body>
<br/><xsl:text>BugId: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/input[3]/@value"/>
<br/><xsl:text>Summary: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/h2"/>
<br/><xsl:text>Resolution: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/tr[3]/td[1]/select/option[@selected='selected']"/>
<br/><xsl:text>Severity: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/tr[3]/td[2]/select/option[@selected='selected']"/>
<br/><xsl:text>Assigned to: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/tr[8]/td[1]/select/option[@selected='selected']"/>
<br/><xsl:text>Priority: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/tr[8]/td[2]/select/option[@selected='selected']"/>
<br/><xsl:text>State: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/tr[9]/td[1]/select/option[@selected='selected']"/>
<br/><xsl:text>Details: </xsl:text>
<xsl:value-of select="//html/body/table[2]/tr[3]/td[2]/table[1]/tr[3]/td[2]/table[1]/form[1]/tr[11]/td[1]/table[1]/tr[1]/td[1]/table[1]/tr[2]/td[1]/pre[1]"/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>



Now you can invoke xsltproc on these files:
$ xsltproc starter.xsl 7777.details.xhtml > 7777.out.html
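
If you downloaded several bugs, steps 3 and 4 can be repeated for every page in one go. The following is a rough sketch, assuming that tagsoup-1.2.jar, starter.xsl and the NNNN.details.html files all sit in the current directory. The function names out_name and convert_all are made up for this illustration.

```shell
#!/bin/sh
# out_name FILE: map NNNN.details.html to NNNN.out.html.
out_name() {
    printf '%s.out.html\n' "${1%.details.html}"
}

# convert_all: run tagsoup and then xsltproc over every downloaded page
# in the current directory.
# Assumption: tagsoup-1.2.jar and starter.xsl are in this directory too.
convert_all() {
    for page in *.details.html; do
        [ -e "$page" ] || continue   # no matches: the glob stays literal
        # tagsoup writes NNNN.details.xhtml next to the source file
        java -jar tagsoup-1.2.jar --nons --files "$page"
        xsltproc starter.xsl "${page%.html}.xhtml" > "$(out_name "$page")"
    done
}

# Usage: convert_all
```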


Notes:
1. It is better to create text files instead of HTML, because line breaks in the detailed description are lost unless you escape them properly.
2. You can use lynx -dump 7777.out.html to create a text version of the HTML file.