Need To Parse Out String From Html Document In A Batch File

I tried searching but couldn't find anything specific to what I need. This is an excerpt from my HTML file:

Solution 1:

Use Windows Script Host (VBScript or JScript). Use the htmlfile COM object to parse the DOM. Then you can massage the innerText as needed with a regexp.

Here you go. Save this as a .bat file, modify the set "htmlfile=test.html" line as needed, and run it. (Derived from this answer. Documentation for the htmlfile COM object in WSH is sparse, but if you would like to learn more about it, follow that breadcrumb.)

@if (@CodeSection == @Batch) @then

@echo off
setlocal

set "htmlfile=test.html"

rem // invoke JScript hybrid code and capture its output
for /f %%I in ('cscript /nologo /e:JScript "%~f0" "%htmlfile%"') do set "converted=%%I"

echo %converted%

rem // end main runtime
goto :EOF

@end // end batch / begin JScript chimera

var fso = WSH.CreateObject('scripting.filesystemobject'),
    DOM = WSH.CreateObject('htmlfile'),
    htmlfile = fso.OpenTextFile(WSH.Arguments(0), 1),
    html = htmlfile.ReadAll();

DOM.write(html);
htmlfile.Close();

var scrape = DOM.getElementById('pair_today').getElementsByTagName('h1')[0].innerText;
// index 1 is the capture group (the token after "="); index 0 would be the whole matched line
WSH.Echo(scrape.match(/^.*=\s+(\S+).*$/)[1]);
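
The regexp step can be tried on its own, away from the DOM. Below is a minimal sketch in plain JavaScript that should also be valid JScript. The sample string and the helper name extractRate are assumptions for illustration, since the actual HTML excerpt is not shown here. It returns capture group 1 (the token after the "=") and guards against a failed match, which would otherwise throw when indexing null.

```javascript
// Sample text is hypothetical; the real <h1> content on the page may differ.
var sample = "1 USD = 0.7812 EUR";

// Return the whitespace-free token that follows the last "= " in text,
// or null when the pattern does not match at all.
function extractRate(text) {
    var m = text.match(/^.*=\s+(\S+).*$/);
    return m ? m[1] : null;   // m[0] would be the whole matched line
}

// Print through WSH.Echo under cscript, or console.log elsewhere (e.g. Node.js).
var echo = (typeof WSH !== "undefined")
    ? function (s) { WSH.Echo(s); }
    : function (s) { console.log(s); };

echo(extractRate(sample));   // prints 0.7812
```

Checking for null before indexing is worthwhile if the page layout might change, because the script then prints "null" instead of aborting with a runtime error.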

You know, as long as you're invoking Windows Script Host anyway, if you're acquiring your HTML file using wget or similar, you might be able to get rid of the wget dependency. Unless the page you're downloading uses a convoluted series of cookies and session redirects, you can replace wget with the Microsoft.XMLHTTP COM object and download the page via XHR (or as those with less organized minds would say, Ajax). (Based on fetch.bat.)

@if (@CodeSection == @Batch) @then

@echo off
setlocal

set "from=%~1"
set "to=%~2"
set "URL=http://host.domain/currency?from=%from%&to=%to%"

for /f "delims=" %%I in ('cscript /nologo /e:jscript "%~f0" "%URL%"') do set "conv=%%I"

echo %conv%

rem // end main runtime
goto :EOF

@end // end batch / begin JScript chimera

var x = WSH.CreateObject("Microsoft.XMLHTTP"),
    DOM = WSH.CreateObject('htmlfile');

x.open("GET",WSH.Arguments(0),true);
x.setRequestHeader('User-Agent','XMLHTTP/1.0');
x.send('');
while (x.readyState!=4) {WSH.Sleep(50)};

DOM.Write(x.responseText);

var scrape = DOM.getElementById('pair_today').getElementsByTagName('h1')[0].innerText;
// index 1 is the capture group (the token after "="); index 0 would be the whole matched line
WSH.Echo(scrape.match(/^.*=\s+(\S+).*$/)[1]);
