init
anjesh committed Jun 10, 2015
0 parents commit f88fea1
Showing 15 changed files with 452 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
*.pyc
32 changes: 32 additions & 0 deletions PdfInfo.py
@@ -0,0 +1,32 @@
import subprocess

"""
ideas from https://gist.github.com/godber/7692812
"""

class PdfInfo:
    def __init__(self, filepath):
        self.filepath = filepath
        self.output = {}
        self.cmd = "pdfinfo"
        self.process()

    def process(self):
        labels = ['Title', 'Author', 'Creator', 'Producer', 'CreationDate',
                  'ModDate', 'Tagged', 'Pages', 'Encrypted', 'Page size',
                  'File size', 'Optimized', 'PDF version']
        # check_output returns bytes on Python 3, so decode before matching.
        cmdOutput = subprocess.check_output([self.cmd, self.filepath])
        for line in cmdOutput.decode('utf-8', 'replace').splitlines():
            for label in labels:
                # Match the label only at the start of the line, so a value
                # that happens to contain a label name is not misread.
                if line.startswith(label + ':'):
                    self.output[label] = self.extract(line)

    def extract(self, row):
        # Split on the first colon only; values such as dates contain colons.
        return row.split(':', 1)[1].strip()

    def getPages(self):
        return int(self.output['Pages'])

    def getFileSizeInBytes(self):
        # The value looks like "81691 bytes"; keep only the leading number.
        return int(self.output['File size'].split()[0])
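
The label-matching loop in `process` can also be written as a generic key/value parse. This is a minimal sketch, assuming `pdfinfo` prints one `Key: value` pair per line. `parse_pdfinfo` and the sample text are hypothetical and not part of this commit; the numbers are the ones the test in this commit expects.

```python
def parse_pdfinfo(text):
    """Parse 'Key: value' lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only: values such as dates contain colons.
        key, sep, value = line.partition(':')
        if sep:
            info[key.strip()] = value.strip()
    return info


# Hypothetical sample shaped like pdfinfo output.
sample = "Pages:          5\nFile size:      81691 bytes\n"
info = parse_pdfinfo(sample)
pages = int(info['Pages'])
size_in_bytes = int(info['File size'].split()[0])
```

Parsing every line avoids keeping a fixed label list in sync with the tool, at the cost of also storing keys the caller never reads.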

32 changes: 32 additions & 0 deletions PdfProcessor.py
@@ -0,0 +1,32 @@
from os import listdir
import os.path
from PdfInfo import PdfInfo
from PdfToText import PdfToText

class PDFProcessor:
    # The original defined __init__ twice; the second definition silently
    # replaced the first, so the two are merged here.
    def __init__(self, filePath, outputDir):
        self.filePath = filePath
        self.outputDir = outputDir
        self.totalPages = 0
        self.structured = False
        self.cmd = ''

    def process(self):
        pdfInfo = PdfInfo(self.filePath)
        self.totalPages = pdfInfo.getPages()
        pdfToText = PdfToText(self.filePath, self.totalPages, self.outputDir)
        pdfToText.extractPages()

    def smellExtractedPages(self):
        for f in listdir(self.outputDir):
            path = os.path.join(self.outputDir, f)
            if os.path.isfile(path):
                # The commit only opens each extracted page and does not yet
                # inspect it; "with" makes sure the handle is closed.
                with open(path) as txt:
                    pass

    def isStructure(self):
        return bool(self.structured)
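
`smellExtractedPages` stops after opening each file. One way it might continue is sketched below, under the assumption that a page with no extractable text (for example a scanned page) yields only whitespace, possibly including a form feed at the page break. `page_has_text` and `count_text_pages` are hypothetical names, not part of this commit.

```python
import os


def page_has_text(text):
    # str.strip() removes all whitespace, including a form feed character.
    return bool(text.strip())


def count_text_pages(output_dir):
    """Return (pages_with_text, total_pages) for the .txt files in output_dir."""
    with_text = 0
    total = 0
    for name in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, name)
        if os.path.isfile(path) and name.endswith('.txt'):
            total += 1
            with open(path) as handle:
                if page_has_text(handle.read()):
                    with_text += 1
    return with_text, total
```

A caller could compare the two counts and set `structured` only when most pages contain text; the threshold is a design choice this commit does not specify.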



25 changes: 25 additions & 0 deletions PdfToText.py
@@ -0,0 +1,25 @@
import subprocess
import os.path

"""
ideas from https://gist.github.com/godber/7692812
"""

class PdfToText:
    def __init__(self, filepath, pages, outputDir):
        self.filepath = filepath
        self.pages = pages
        self.output = {}
        self.outputDir = outputDir
        self.cmd = "pdftotext"

    def extractPage(self, page):
        # -f and -l set the first and last page, so exactly one page is
        # written to <outputDir>/<page>.txt.
        outputFileName = os.path.join(self.outputDir, str(page) + ".txt")
        # Return the exit status (0 on success) instead of discarding it.
        return subprocess.call([self.cmd, "-f", str(page), "-l", str(page),
                                self.filepath, outputFileName])

    def extractPages(self):
        for page in range(1, self.pages + 1):
            self.extractPage(page)



2 changes: 2 additions & 0 deletions runtest.sh
@@ -0,0 +1,2 @@
python -m unittest tests.PdfInfoTest
python -m unittest tests.PdfToTextTest
17 changes: 17 additions & 0 deletions tests/PdfInfoTest.py
@@ -0,0 +1,17 @@
#!/usr/local/bin/python

import unittest

from PdfInfo import PdfInfo

class PdfInfoTest(unittest.TestCase):
    def setUp(self):
        pass

    def testPdfPages(self):
        # The constructor already runs process(), so no explicit call is needed.
        pdfInfo = PdfInfo('tests/sample.pdf')
        self.assertEqual(pdfInfo.getPages(), 5)
        self.assertEqual(pdfInfo.getFileSizeInBytes(), 81691)

22 changes: 22 additions & 0 deletions tests/PdfToTextTest.py
@@ -0,0 +1,22 @@
#!/usr/local/bin/python

import unittest

from PdfToText import PdfToText

class PdfToTextTest(unittest.TestCase):
    def setUp(self):
        pass

    def testPdfPages(self):
        pdfToText = PdfToText('tests/sample.pdf', 5, "tests/out")
        pdfToText.extractPage(1)

    def testPdfAllPages(self):
        pdfToText = PdfToText('tests/sample.pdf', 5, "tests/out")
        pdfToText.extractPages()

    def testPdfPageScanned(self):
        pdfToText = PdfToText('tests/sample-scanned.pdf', 5, "tests/out")
        pdfToText.extractPage(2)
Empty file added tests/__init__.py
Empty file.
70 changes: 70 additions & 0 deletions tests/out/1.txt
@@ -0,0 +1,70 @@
Table of Contents

UNITED STATES
SECURITIES AND EXCHANGE COMMISSION
Washington, D.C. 20549

FORM 10-Q
(Mark One)

x QUARTERLY REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE
ACT OF 1934
For the quarterly period ended September 30, 2013

o TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE
ACT OF 1934
For the transition period from

to

Commission file number: 001-35167

Kosmos Energy Ltd.
(Exact name of registrant as specified in its charter)

Bermuda

98-0686001

(State or other jurisdiction of
incorporation or organization)

(I.R.S. Employer
Identification No.)

Clarendon House
2 Church Street
Hamilton, Bermuda
(Address of principal executive offices)

(Zip Code)

HM 11

Registrant’s telephone number, including area code: +1 441 295 5950

Not applicable
(Former name, former address and former fiscal year, if changed since last report)
Indicate by check mark whether the registrant: (1) has filed all reports required to be filed by Section 13 or 15(d) of the Securities Exchange Act of 1934
during the preceding 12 months (or for such shorter period that the registrant was required to file such reports), and (2) has been subject to such filing
requirements for the past 90 days. Yes x No o

Indicate by check mark whether the registrant has submitted electronically and posted on its corporate Web site, if any, every Interactive Data File
required to be submitted and posted pursuant to Rule 405 of Regulation S-T (§232.405 of this chapter) during the preceding 12 months (or for such shorter
period that the registrant was required to submit and post such files). Yes x No o
Indicate by check mark whether the registrant is a large accelerated filer, an accelerated filer, a non-accelerated filer, or a smaller reporting company. See
the definitions of “large accelerated filer,” “accelerated filer” and “smaller reporting company” in Rule 12b-2 of the Exchange Act.
Large accelerated filer x

Accelerated filer o

Non-accelerated filer o
(Do not check if a smaller reporting company)

Smaller reporting company o

Indicate by check mark whether the registrant is a shell company (as defined in Rule 12b-2 of the Exchange Act). Yes o No x

Indicate the number of shares outstanding of each of the issuer’s classes of common stock, as of the latest practicable date.


1 change: 1 addition & 0 deletions tests/out/2.txt
@@ -0,0 +1 @@

63 changes: 63 additions & 0 deletions tests/out/3.txt
@@ -0,0 +1,63 @@
Table of Contents

TABLE OF CONTENTS

Unless otherwise stated in this report, references to “Kosmos,” “we,” “us” or “the company” refer to Kosmos Energy Ltd. and its
subsidiaries. We have provided definitions for some of the industry terms used in this report in the “Glossary and Selected Abbreviations” beginning
on page 3.
Page

PART I. FINANCIAL INFORMATION
Glossary and Select Abbreviations

3

Item 1. Financial Statements

Consolidated Balance Sheets as of September 30, 2013 and December 31, 2012
Consolidated Statements of Operations for the three and nine months ended September 30, 2013 and 2012
Consolidated Statements of Comprehensive Loss for the three and nine months ended September 30, 2013 and 2012
Consolidated Statements of Shareholders’ Equity for the nine months ended September 30, 2013
Consolidated Statements of Cash Flows for the nine months ended September 30, 2013 and 2012
Notes to Consolidated Financial Statements
Item 2. Management’s Discussion and Analysis of Financial Condition and Results of Operations
Item 3. Quantitative and Qualitative Disclosures about Market Risk
Item 4. Controls and Procedures

6
7
8
9
10

11
22
31
33

PART II. OTHER INFORMATION
Item 1. Legal Proceedings
Item 1A. Risk Factors

34
34
34
34
34
34

Item 2. Unregistered Sales of Equity Securities and Use of Proceeds
Item 3. Defaults Upon Senior Securities
Item 4. Mine Safety Disclosures
Item 5. Other Information
Item 6. Exhibits
Signatures
Index to Exhibits

35
36
37

2


91 changes: 91 additions & 0 deletions tests/out/4.txt
@@ -0,0 +1,91 @@
Table of Contents

KOSMOS ENERGY LTD.
GLOSSARY AND SELECTED ABBREVIATIONS
The following are abbreviations and definitions of certain terms that may be used in this report. Unless listed below, all defined terms under Rule 410(a) of Regulation S-X shall have their statutorily prescribed meanings.

“2D seismic data”

Two-dimensional seismic data, serving as interpretive data that allows a view of a vertical cross-section beneath
a prospective area.

“3D seismic data”

Three-dimensional seismic data, serving as geophysical data that depicts the subsurface strata in three
dimensions. 3D seismic data typically provides a more detailed and accurate interpretation of the subsurface
strata than 2D seismic data.

“API”

A specific gravity scale, expressed in degrees, that denotes the relative density of various petroleum liquids. The
scale increases inversely with density. Thus lighter petroleum liquids will have a higher API than heavier ones.

“ASC”

Financial Accounting Standards Board Accounting Standards Codification.

“ASU”

Financial Accounting Standards Board Accounting Standards Update.

“Barrel” or “Bbl”

A standard measure of volume for petroleum corresponding to approximately 42 gallons at 60 degrees
Fahrenheit.

“BBbl”

Billion barrels of oil.

“BBoe”

Billion barrels of oil equivalent.

“Bcf”

Billion cubic feet.

“Boe”

Barrels of oil equivalent. Volumes of natural gas converted to barrels of oil using a conversion factor of 6,000
cubic feet of natural gas to one barrel of oil.

“Boepd”

Barrels of oil equivalent per day.

“Bopd”

Barrels of oil per day.

“Bwpd”

Barrels of water per day.

“Debt cover ratio”

The “debt cover ratio” is broadly defined, for each applicable calculation date, as the ratio of (x) total long-term
debt less cash and cash equivalents and restricted cash, to (y) the aggregate EBITDAX (see below) of the
Company for the previous twelve months.

“Developed acreage”

The number of acres that are allocated or assignable to productive wells or wells capable of production.

“Development”

The phase in which an oil or natural gas field is brought into production by drilling development wells and
installing appropriate production systems.

“Dry hole”

A well that has not encountered a hydrocarbon bearing reservoir expected to produce in commercial quantities.

“EBITDAX”

Net income (loss) plus (1) exploration expense, (2) depletion, depreciation and amortization expense, (3) equitybased compensation expense, (4) (gain) loss on commodity derivatives, (5) (gain) loss on sale of oil and gas
properties, (6) interest (income) expense, (7) income taxes, (8) loss on extinguishment of debt, (9) doubtful
accounts expense, and (10) similar items.
3

