Using the adapter pattern to parse HTML with C# and Html Agility Pack

Recently I faced a business requirement: extract information from some HTML pages and display it in a local application.

The main problem was that the results came in HTML format, and I needed to transform them into a C# object in order to manage the information in my application.

An adapter seemed a very good fit for this purpose, so I started an implementation of the pattern in C#.

Adapter Pattern

There is tons of documentation about the adapter pattern in C#, so I don't want to bore you with concepts we already know; I will only share an image from the MSDN site:


Starting from different sources, such as HTML pages, I need an adapter that gives me an object that I know. I will have HTML pages with different structures and therefore different adapters, but they should all give me the same object.

With these concepts settled, I can proceed to define the contracts.


I define the properties of the common object with a specific interface:

interface ICompany
{
    string Name { get; }
    string VatNumber { get; }
    string Email { get; }
    string TaxPayerCode { get; }
}

These are the properties that my adapter will have to fill in after the HTML parsing. The second contract that I need is the adapter contract:

interface ICompanyAdapter
{
    Task FindAsync(string key);
}

Now I can implement my adapter, which deals with parsing the HTML code; Html Agility Pack helps me with this work:

using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public class CompanyAdapter : ICompany, ICompanyAdapter
{
    private const string Uri = "http://urladdress";
    // A single shared HttpClient avoids creating a new connection pool on every search
    private static readonly HttpClient _httpClient = new HttpClient();
    private List<HtmlNode> _nodes = new List<HtmlNode>();

    public string Name => ExtractData(_nodes, @"company");
    public string VatNumber => ExtractData(_nodes, @"vat number");
    public string Email => ExtractData(_nodes, @"email");
    public string TaxPayerCode => ExtractData(_nodes, @"tax payer code");

    public async Task FindAsync(string key)
    {
        // Load is not shown here: it performs the HTTP request and returns the page as a string
        var html = await Load(Uri + "/searchCompanies", key);
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        Parse(doc);
    }

    private void Parse(HtmlDocument doc)
    {
        // SelectSingleNode returns null when the page has no body element
        var body = doc.DocumentNode.SelectSingleNode("//body");
        _nodes = body?.Descendants("div").ToList() ?? new List<HtmlNode>();
    }

    private string ExtractData(List<HtmlNode> nodes, string tag)
    {
        foreach (var node in nodes)
        {
            var p = node.Descendants("p").ToList();

            //some custom logic to extract data
        }

        return "";
    }
}


The class implements the two interfaces defined above, parses the HTML code and stores the nodes in a private field.

The public properties of the company interface then rely on the ExtractData method to lazily retrieve the information from the list of nodes.

Merge results

I can have many of these adapters to call in a service, and at this stage every adapter returns its data as defined in the contracts.

So I need a merge strategy to combine the results of the adapters into a single object.

I have a Company class that implements the ICompany interface and a Merge method that deals with this work:

public class Company : ICompany
{
    // Private setters are required because Merge assigns these properties
    public string Name { get; private set; }
    public string VatNumber { get; private set; }
    public string Email { get; private set; }
    public string TaxPayerCode { get; private set; }

    // Keeps a value that is already set and fills the empty ones from the given company
    public void Merge(ICompany company)
    {
        Name = string.IsNullOrEmpty(Name) ? company.Name : Name;
        VatNumber = string.IsNullOrEmpty(VatNumber) ? company.VatNumber : VatNumber;
        Email = string.IsNullOrEmpty(Email) ? company.Email : Email;
        TaxPayerCode = string.IsNullOrEmpty(TaxPayerCode) ? company.TaxPayerCode : TaxPayerCode;
    }
}
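
To make the merge strategy concrete, here is a minimal sketch of how two partial results combine. StubCompany and MergeDemo are hypothetical names that I use only for this illustration (they stand in for real adapters), and the values are made up:

```csharp
using System;

public interface ICompany
{
    string Name { get; }
    string VatNumber { get; }
    string Email { get; }
    string TaxPayerCode { get; }
}

public class Company : ICompany
{
    public string Name { get; private set; }
    public string VatNumber { get; private set; }
    public string Email { get; private set; }
    public string TaxPayerCode { get; private set; }

    public void Merge(ICompany company)
    {
        Name = string.IsNullOrEmpty(Name) ? company.Name : Name;
        VatNumber = string.IsNullOrEmpty(VatNumber) ? company.VatNumber : VatNumber;
        Email = string.IsNullOrEmpty(Email) ? company.Email : Email;
        TaxPayerCode = string.IsNullOrEmpty(TaxPayerCode) ? company.TaxPayerCode : TaxPayerCode;
    }
}

// Hypothetical stand-in for the result of an adapter
public class StubCompany : ICompany
{
    public string Name { get; set; }
    public string VatNumber { get; set; }
    public string Email { get; set; }
    public string TaxPayerCode { get; set; }
}

public static class MergeDemo
{
    public static Company Run()
    {
        var company = new Company();

        // First source: knows the name, but its email is empty
        company.Merge(new StubCompany { Name = "Acme", Email = "" });

        // Second source: the name is ignored (already set), the empty fields are filled
        company.Merge(new StubCompany
        {
            Name = "Acme Ltd",
            VatNumber = "123",
            Email = "info@acme.example"
        });

        return company;
    }
}
```

With this strategy the order of the Merge calls is the priority order: the first source that provides a non-empty value for a property wins, and a property that no source provides stays null.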

Now the last step is to invoke the adapters in the service:

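public class CompanyService
{
    public async Task<Company> FindAsync(string key)
    {
        var adapter1 = new CompanyAdapter1();
        var adapter2 = new CompanyAdapter2();
        await adapter1.FindAsync(key);
        await adapter2.FindAsync(key);

        // Merge in priority order: the first non-empty value for each property wins
        var company = new Company();
        company.Merge(adapter1);
        company.Merge(adapter2);

        return company;
    }
}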

I’m able to call all the adapters that I want and merge the results with a specific strategy.

This was possible thanks to the adoption of common contracts: every adapter returns a result that the caller already knows.
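
Because the caller depends only on the contracts, the service does not need to know the concrete adapters at all. The following is a sketch under my own assumptions, not the code of the real application: the adapters are passed in through the constructor, and CannedAdapter is a hypothetical adapter that returns fixed data instead of parsing HTML, so that the example runs without a network.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public interface ICompany
{
    string Name { get; }
    string VatNumber { get; }
    string Email { get; }
    string TaxPayerCode { get; }
}

public interface ICompanyAdapter
{
    Task FindAsync(string key);
}

public class Company : ICompany
{
    public string Name { get; private set; }
    public string VatNumber { get; private set; }
    public string Email { get; private set; }
    public string TaxPayerCode { get; private set; }

    public void Merge(ICompany company)
    {
        Name = string.IsNullOrEmpty(Name) ? company.Name : Name;
        VatNumber = string.IsNullOrEmpty(VatNumber) ? company.VatNumber : VatNumber;
        Email = string.IsNullOrEmpty(Email) ? company.Email : Email;
        TaxPayerCode = string.IsNullOrEmpty(TaxPayerCode) ? company.TaxPayerCode : TaxPayerCode;
    }
}

// Hypothetical adapter: returns canned data instead of parsing an HTML page
public class CannedAdapter : ICompany, ICompanyAdapter
{
    private readonly string _name;
    private readonly string _email;

    public CannedAdapter(string name, string email)
    {
        _name = name;
        _email = email;
    }

    public string Name { get; private set; }
    public string VatNumber { get; private set; }
    public string Email { get; private set; }
    public string TaxPayerCode { get; private set; }

    // The data becomes visible only after the search has run
    public Task FindAsync(string key)
    {
        Name = _name;
        Email = _email;
        return Task.CompletedTask;
    }
}

public class CompanyService
{
    private readonly IReadOnlyList<ICompanyAdapter> _adapters;

    public CompanyService(IReadOnlyList<ICompanyAdapter> adapters)
    {
        _adapters = adapters;
    }

    public async Task<Company> FindAsync(string key)
    {
        // Run all the searches, then wait until every adapter has finished
        await Task.WhenAll(_adapters.Select(a => a.FindAsync(key)));

        // Merge in list order: earlier adapters take priority
        var company = new Company();
        foreach (var result in _adapters.OfType<ICompany>())
        {
            company.Merge(result);
        }

        return company;
    }
}
```

A caller would build the service with something like new CompanyService(new ICompanyAdapter[] { first, second }). Adding a new HTML source then means writing one more adapter and adding it to the list, without touching the service.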

