Posts


OpenWebKitSharp Document Text Similar To .NET WebBrowser In C#

Here's how you get the page source of a WebKitBrowser, similar to the DocumentText property of a .NET WebBrowser control.

    JSValue val = webkitBrowser.GetScriptManager.EvaluateScript(
        "document.getElementsByTagName(\"html\")[0].outerHTML;");
    if (val != null)
    {
        PageSource = val.ToString();
    }

Greg

How To Set Attribute Option Element In OpenWebkitSharp C#

Here's how you set an attribute on an option element using OpenWebKitSharp.

    foreach (WebKit.DOMNode optionElement in nodeItem.ChildNodes)
    {
        if (searchInnerHtml && optionElement.TextContent.Equals(searchInnerString))
        {
            // Get the underlying interop element so setAttribute() is available.
            WebKit.Interop.IDOMHTMLElement el =
                (WebKit.Interop.IDOMHTMLElement)optionElement.GetWebKitObject();
            el.setAttribute("selected", "selected");
        }
    }

Retrieving the COM class factory for component with CLSID {D6BCA079-F61C-4E1E-B453-32A0477D02E3} Openwebkitsharp3.0

If you encounter the following error while running OpenWebKitSharp on your workstation or production server:

    Retrieving the COM class factory for component with CLSID {D6BCA079-F61C-4E1E-B453-32A0477D02E3} failed due to the following error: 800736b1 The application has failed to start because its side-by-side configuration is incorrect. Please see the application event log or use the command-line sxstrace.exe tool for more detail. (Exception from HRESULT: 0x800736B1).

The solution is to download the Microsoft Visual C++ 2005 Service Pack 1 Redistributable Package MFC Security Update (link: Visual C++ 2005 Service Pack 1 Redistributable) and install it on your server or workstation.

Greg

GeckoFX How To Trigger Or Fire JavaScript __doPostBack() Method

Here's how you fire a __doPostBack() in ASP.NET using GeckoFX.

    // Fire the JS method.
    using (AutoJSContext context = new AutoJSContext(this.JSContext))
    {
        string result;
        context.EvaluateScript(
            string.Format("javascript:__doPostBack('{0}','')", paramControlName),
            out result);
    }

Cheers!

GeckoFX DocumentText Similar To Webbrowser Control (C#)

In a traditional WebBrowser control, you can get the page source like this:

    string pageSource = webbrowser1.DocumentText;

However, the GeckoFX web browser has no DocumentText property. A workaround is to read the InnerHtml property of the html tag using the code below:

    string pageSource = string.Empty;
    if (!string.IsNullOrEmpty(this.Document.GetElementsByTagName("html")[0].InnerHtml))
        pageSource = this.Document.GetElementsByTagName("html")[0].InnerHtml;

Cheers!

GeckoFx getElementByID() click() Method Missing

In an application that I am creating using GeckoFX version 15, I noticed that the elements returned by GetElementsByTagName() have a Click() member that can be invoked, as shown below:

    this.Document.GetElementsByTagName(geckoElement.TagName)[indexSearchElement].Click();

But the element returned by GetElementById() has no such member. After experimenting for a few hours, I came up with the solution below:

    ((GeckoHtmlElement)this.Document.GetElementById(elemId)).Click();

The trick was to cast it to GeckoHtmlElement.

Greg

WebClient Slow In Crawling Or Web Scraping A Website In C#

Here's a tip I got from Stack Overflow for a WebClient that is slow when crawling.

    ServicePointManager.DefaultConnectionLimit = int.MaxValue;
    ServicePointManager.MaxServicePoints = int.MaxValue;
    ServicePointManager.MaxServicePointIdleTime = 0;

I simply set the default connection limit and max service points to explicit numeric values. After that, the crawling started to speed up.

Greg
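As a rough sketch of where these settings might go: a changed DefaultConnectionLimit only applies to ServicePoint objects created after the change, so it seems safest to apply the settings once at startup, before the first request is made. The CrawlerSetup class and its method names are hypothetical, made up for illustration; the values are the ones from the tip above.

```csharp
using System;
using System.Net;

// Hypothetical helper; class and method names are assumptions for illustration.
public static class CrawlerSetup
{
    // Call once at startup, before the first WebClient or WebRequest is created.
    public static void RaiseConnectionLimits()
    {
        ServicePointManager.DefaultConnectionLimit = int.MaxValue;
        ServicePointManager.MaxServicePoints = int.MaxValue;
        ServicePointManager.MaxServicePointIdleTime = 0;
    }

    // Downloads one page as a string; a sketch with no retry or error handling.
    public static string Download(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }
}
```

A crawler would call CrawlerSetup.RaiseConnectionLimits() first and only then start issuing Download() calls, possibly from several threads.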

WebRequest Url Not Returning Correct Page Source If Proxynull Not Used As Part Of Url Query String (Web Scraping) In C#

On one of the sites I'm crawling, I ran into a situation where the site needs a query string parameter such as proxynull=90B69303-3A61-4482-AF0725FDA1DAE548 appended to the URL, like this:

    http://samplesite/bin/jobs_list.cfm?proxynull=90B69303-3A61-4482-AF0725FDA1DAE548

I wondered if I could just send the post data and use the URL without the proxynull query string, like this:

    http://samplesite/bin/jobs_list.cfm

After a series of experiments, the solution is to set the proxy of the WebRequest object to the default proxy, similar to the code below:

    ((HttpWebRequest)webRequest).Proxy = WebRequest.DefaultWebProxy;

With that in place, the URL (http://samplesite/bin/jobs_list.cfm) can be used without proxynull.
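For context, here is a minimal sketch of how the proxy line might sit inside the request setup. The JobListRequest class and Create method are hypothetical names, and the form-encoded content type is an assumption. The post body itself is left out because the original does not show the form fields.

```csharp
using System;
using System.Net;

// Hypothetical helper; the names and the content type are assumptions.
public static class JobListRequest
{
    // Builds (but does not send) a POST request that uses the default proxy.
    public static HttpWebRequest Create(string url)
    {
        WebRequest webRequest = WebRequest.Create(url);

        // The fix described above: use the system default proxy.
        ((HttpWebRequest)webRequest).Proxy = WebRequest.DefaultWebProxy;

        webRequest.Method = "POST";
        webRequest.ContentType = "application/x-www-form-urlencoded";
        return (HttpWebRequest)webRequest;
    }
}
```

Calling JobListRequest.Create("http://samplesite/bin/jobs_list.cfm") returns a request object; the post data would then be written to its request stream before reading the response.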

Cannot find JavaScriptSerializer in .Net 4.0

These are the steps for using it in .NET 4.0:

1. Create a new console application.
2. Change the target framework to .NET Framework 4 instead of the Client Profile.
3. Add a reference to System.Web.Extensions (4.0).
4. JavaScriptSerializer is now accessible in Program.cs :-)

Source: Cannot Find Javascript Serializer in .NET 4.0
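Once the System.Web.Extensions reference is in place, usage could look roughly like the sketch below. It assumes a project targeting the full .NET Framework 4; the JsonDemo class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Web.Script.Serialization; // namespace provided by System.Web.Extensions.dll

// Hypothetical wrapper; class and method names are made up for illustration.
public static class JsonDemo
{
    // Serializes a dictionary to a JSON string.
    public static string ToJson(Dictionary<string, object> data)
    {
        return new JavaScriptSerializer().Serialize(data);
    }

    // Parses a JSON object back into a dictionary.
    public static Dictionary<string, object> FromJson(string json)
    {
        return new JavaScriptSerializer().Deserialize<Dictionary<string, object>>(json);
    }
}
```

If the using directive does not resolve, the project is most likely still targeting the Client Profile or is missing the reference from step 3.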

Remove HTML Tags In An XML String Document Using Regular Expressions (REGEX) In C#

Here's a regex pattern that matches the HTML tags present in an XML string document, where xml, node1, node2, node3, node4, node5, node6 and node7 are the XML tags to keep. node1 could stand for a valid XML tag name such as employername or a similar tag name.

    xmlStringData = Regex.Replace(
        xmlStringData,
        @"<((\??)|(/?)|(!))\s?\b(?!(\b(xml|node1|node2|node3|node4|node5|node6|node7)\b))\b[^>]*((/?))>",
        " ",
        RegexOptions.IgnoreCase);

Greg
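As a small sketch of how the pattern could be wrapped for reuse, the helper below builds the list of kept tag names from its arguments. HtmlTagStripper and Strip are hypothetical names, and the tag names job and employer in the usage note are made-up examples.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are assumptions for illustration.
public static class HtmlTagStripper
{
    // Replaces every tag whose name is not "xml" or one of keepTags with a space.
    public static string Strip(string xml, params string[] keepTags)
    {
        // "xml" is always kept, so the alternation is never empty.
        string names = string.Join("|",
            new[] { "xml" }.Concat(keepTags).Select(t => Regex.Escape(t)));

        string pattern =
            @"<((\??)|(/?)|(!))\s?\b(?!(\b(" + names + @")\b))\b[^>]*((/?))>";

        return Regex.Replace(xml, pattern, " ", RegexOptions.IgnoreCase);
    }
}
```

For example, HtmlTagStripper.Strip("<job><employer><b>Acme</b></employer></job>", "job", "employer") returns "<job><employer> Acme </employer></job>": the b tags are replaced with spaces while the job and employer tags are kept.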
