Initial commit of library, unit tests, and pom file.
Kevin-Sangeelee committed Feb 26, 2020
0 parents commit 14df58f
Showing 9 changed files with 13,590 additions and 0 deletions.
12 changes: 12 additions & 0 deletions .classpath
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
<classpathentry kind="src" path="src/main/java"/>
<classpathentry kind="src" path="src/test/java"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/java-11-openjdk-amd64">
<attributes>
<attribute name="module" value="true"/>
</attributes>
</classpathentry>
<classpathentry kind="con" path="org.eclipse.jdt.junit.JUNIT_CONTAINER/5"/>
<classpathentry kind="output" path="bin"/>
</classpath>
17 changes: 17 additions & 0 deletions .project
@@ -0,0 +1,17 @@
<?xml version="1.0" encoding="UTF-8"?>
<projectDescription>
<name>PublicSuffixList</name>
<comment></comment>
<projects>
</projects>
<buildSpec>
<buildCommand>
<name>org.eclipse.jdt.core.javabuilder</name>
<arguments>
</arguments>
</buildCommand>
</buildSpec>
<natures>
<nature>org.eclipse.jdt.core.javanature</nature>
</natures>
</projectDescription>
12 changes: 12 additions & 0 deletions .settings/org.eclipse.jdt.core.prefs
@@ -0,0 +1,12 @@
eclipse.preferences.version=1
org.eclipse.jdt.core.compiler.codegen.inlineJsrBytecode=enabled
org.eclipse.jdt.core.compiler.codegen.methodParameters=do not generate
org.eclipse.jdt.core.compiler.codegen.targetPlatform=9
org.eclipse.jdt.core.compiler.codegen.unusedLocal=preserve
org.eclipse.jdt.core.compiler.compliance=9
org.eclipse.jdt.core.compiler.debug.lineNumber=generate
org.eclipse.jdt.core.compiler.debug.localVariable=generate
org.eclipse.jdt.core.compiler.debug.sourceFile=generate
org.eclipse.jdt.core.compiler.problem.assertIdentifier=error
org.eclipse.jdt.core.compiler.problem.enumIdentifier=error
org.eclipse.jdt.core.compiler.source=9
62 changes: 62 additions & 0 deletions README.md
@@ -0,0 +1,62 @@
Public Suffix List Helper
-------------------------

##### Terminology used in this page:

* TLD - Top Level Domain
* SLD - Second Level Domain
* eTLD - effective Top Level Domain
* ccSLD - country-code Second Level Domain

The Public Suffix List is a register of domain suffixes that are used in
combination to provide effective Top Level Domains for certain actual Top
Level Domains.

For example, the TLD .uk has a number of second-level domains that are for most
purposes considered part of the TLD. Therefore, .co.uk, .ac.uk, .org.uk can all
be thought of as eTLDs, each made up of a ccSLD and an actual TLD.

```
Host    SLD        ccSLD    TLD
-------------------------------
www     example             com
www     example    co       uk
```

Above, the eTLDs would be '.com' and '.co.uk'.

In the `net.susa.cfs.psl` package, I consider the part to the left of the eTLD to
be the SLD. Everything to the left of the SLD is taken as the host.

To achieve this, we load the Public Suffix List into a tree structure. The
first tree level contains the actual TLDs, each of which contains a list of
their ccSLDs, each of which can contain further ccSLDs, and so on.

```
                     root
                       |
          ---------------------------
          |                         |
          uk                        au
          |                         |
    -------------           -----------------
    |     |     |           |               |
    co    ac    org         edu             gov
                            |               |
                      -------------      -------
                      |     |     |      |     |
                      vic   wa    tas    vic   wa
```

Using a tree allows us to store and search the list of ~9000 entries quickly
when processing large numbers of URLs.
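
As a rough illustration of the tree described above, here is a minimal sketch in Java. The class and method names (`SuffixTreeSketch`, `add`, `longestSuffix`) are hypothetical and are not the API of the `net.susa.cfs.psl` package. The sketch also assumes that every node in the tree counts as a valid suffix, and it ignores the wildcard and exception rules that the real Public Suffix List contains.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only; names are illustrative, not the library's API.
class SuffixTreeSketch {

    // One node per domain label; children are keyed by the next label to the left.
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Adds a suffix such as "co.uk" by walking its labels from right to left.
    void add(String suffix) {
        String[] labels = suffix.toLowerCase(Locale.ROOT).split("\\.");
        Node node = root;
        for (int i = labels.length - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(labels[i], k -> new Node());
        }
    }

    // Returns the longest path in the tree that matches the end of the domain,
    // or the empty string when not even the rightmost label is in the tree.
    String longestSuffix(String domain) {
        String[] labels = domain.toLowerCase(Locale.ROOT).split("\\.");
        Node node = root;
        int start = labels.length;
        for (int i = labels.length - 1; i >= 0; i--) {
            Node next = node.children.get(labels[i]);
            if (next == null) {
                break;
            }
            node = next;
            start = i;
        }
        return String.join(".", Arrays.copyOfRange(labels, start, labels.length));
    }
}
```

Each lookup visits at most one node per label in the domain, so the cost depends on the depth of the name and not on the number of entries in the list.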

The API for looking up the eTLD of a domain is:

String getETLD(String fqdn)

Returns the substring that should be considered the eTLD. If the domain does not
match any entry from the Public Suffix List, then the first part of the domain
is returned. If the string cannot be parsed as a valid domain, the function
returns the empty string.
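
Once the eTLD is known, the SLD and host described earlier can be split off with plain string handling. The following is a minimal sketch under stated assumptions: the `DomainParts` class and its `split` method are hypothetical helpers and not part of this library, and the eTLD is assumed to be passed without a leading dot (for example `co.uk`).

```java
// Hypothetical helper; not part of the net.susa.cfs.psl package.
class DomainParts {
    final String host;
    final String sld;
    final String etld;

    private DomainParts(String host, String sld, String etld) {
        this.host = host;
        this.sld = sld;
        this.etld = etld;
    }

    // Splits a domain into host, SLD and eTLD, given an eTLD such as "co.uk".
    static DomainParts split(String fqdn, String etld) {
        String suffix = "." + etld;
        if (!fqdn.endsWith(suffix)) {
            throw new IllegalArgumentException(fqdn + " does not end with " + suffix);
        }
        // Everything to the left of the eTLD, e.g. "www.example".
        String rest = fqdn.substring(0, fqdn.length() - suffix.length());
        int lastDot = rest.lastIndexOf('.');
        // The label directly to the left of the eTLD is the SLD.
        String sld = rest.substring(lastDot + 1);
        // Anything further left is the host; it is empty for a bare "example.co.uk".
        String host = lastDot < 0 ? "" : rest.substring(0, lastDot);
        return new DomainParts(host, sld, etld);
    }
}
```

For `www.example.co.uk` with the eTLD `co.uk`, this yields the host `www` and the SLD `example`.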

65 changes: 65 additions & 0 deletions pom.xml
@@ -0,0 +1,65 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

<modelVersion>4.0.0</modelVersion>

<groupId>net.susa.cfs.psl</groupId>
<artifactId>PublicSuffixList</artifactId>

<packaging>jar</packaging>

<version>1.0-SNAPSHOT</version>

<name>PublicSuffixList</name>
<url>http://maven.apache.org</url>

<properties>
<maven.compiler.source>9</maven.compiler.source>
<maven.compiler.target>9</maven.compiler.target>
</properties>

<dependencies>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<version>5.3.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<version>5.3.2</version>
<scope>test</scope>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.0</version>
<configuration>
<argLine>
--illegal-access=permit
</argLine>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-failsafe-plugin</artifactId>
<version>2.22.0</version>
<configuration>
<argLine>
--illegal-access=permit
</argLine>
</configuration>
</plugin>
</plugins>
</build>
</project>