Windows Batch / Parse Data From Html Web Page

Is it possible to parse data from an HTML web page using a Windows batch script? Let's say I have a web page: www.domain.com/data/page/1 Page source HTML: ...

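The batch language isn't well suited to parsing markup or structured data such as HTML, XML, or JSON. In such cases, it can be extremely helpful to use a hybrid script and borrow JScript or PowerShell methods to scrape the data you need. Here's an example of a batch + JScript hybrid script. Its first line is what makes the hybrid work: cmd reads it as an if comparison that is never true, while JScript reads @if ... @end as a conditional compilation block whose condition is false, so it skips the batch portion. Save it with a .bat extension and give it a run.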

@if (@CodeSection == @Batch) @then
@echo off & setlocal

set "url=http://www.domain.com/data/page/1"

for /f "delims=" %%I in ('cscript /nologo /e:JScript "%~f0" "%url%"') do (
    rem // do something useful with %%I
    echo Link found: %%I
)

goto :EOF
@end // end batch / begin JScript hybrid code

// returns a DOM root object
function fetch(url) {
    var XHR = WSH.CreateObject("Microsoft.XMLHTTP"),
        DOM = WSH.CreateObject('htmlfile');

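    XHR.open("GET", url, true);   // asynchronous request
    XHR.setRequestHeader('User-Agent', 'XMLHTTP/1.0');
    XHR.send('');
    // poll until the response has been fully received
    while (XHR.readyState != 4) { WSH.Sleep(25); }
    // request IE9 document mode from the htmlfile parser, then load the markup
    DOM.write('<meta http-equiv="x-ua-compatible" content="IE=9" />');
    DOM.write(XHR.responseText);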
    return DOM;
}

var DOM = fetch(WSH.Arguments(0)),
    links = DOM.getElementsByTagName('a');

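// Walk the collection by index; for...in would also visit non-element
// properties such as "length".
for (var i = 0; i < links.length; i++) {
    if (links[i].href && /\/post\/view\//i.test(links[i].href)) {
        WSH.Echo(links[i].href);
    }
}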
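
The link filter at the end of the script can be checked on its own, without the network fetch. Here is a minimal sketch, assuming a hypothetical helper named filterLinks (not part of the original script) and made-up sample URLs. It sticks to ES3-style syntax so the same function could be pasted into the JScript section:

```javascript
// Hypothetical helper (not in the original script): returns the hrefs
// that match the pattern, in their original order. Empty values are skipped.
function filterLinks(hrefs, pattern) {
    var matches = [];
    for (var i = 0; i < hrefs.length; i++) {
        if (hrefs[i] && pattern.test(hrefs[i])) {
            matches.push(hrefs[i]);
        }
    }
    return matches;
}

// Made-up sample data, for illustration only.
var sample = [
    "http://www.domain.com/post/view/123",
    "http://www.domain.com/about",
    "",
    "http://www.domain.com/POST/VIEW/456"
];

// Same case-insensitive pattern the hybrid script uses.
var found = filterLinks(sample, /\/post\/view\//i);
```

In the hybrid script, the values would come from links[i].href rather than a hard-coded array, and each match would be passed to WSH.Echo so the batch for /f loop can read it.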
