
Issue python-boilerpipe on docker #57

Open
lraghib opened this issue Jun 22, 2020 · 1 comment

lraghib commented Jun 22, 2020

I am trying to use python-boilerpipe in Docker, but the code blocks at this line:

extractor = Extractor(extractor='ArticleExtractor',url=link, headers=self.headers)

It never returns anything, even though the same code works fine outside Docker.

My Dockerfile looks like this:

FROM tiangolo/uwsgi-nginx-flask:python3.6

RUN pip3 install --upgrade pip
# copy over our requirements.txt file
COPY requirements.txt /tmp/
WORKDIR /tmp/

# Install OpenJDK-11
# Install "software-properties-common" (for the "add-apt-repository")
RUN apt-get update
RUN apt-get install -y software-properties-common 
RUN add-apt-repository ppa:openjdk/ppa
RUN apt-get install -y openjdk-11-jdk && \
    apt-get install -y ant && \
    apt-get clean;

# Fix certificate issues
RUN apt-get install -y ca-certificates-java && \
    apt-get clean && \
    update-ca-certificates -f;

# Setup JAVA_HOME -- useful for docker commandline
ENV JAVA_HOME /usr/lib/jvm/java-11-openjdk-amd64/
# "RUN export ..." only affects the shell of that single build step, so set PATH with ENV instead
ENV PATH $PATH:/usr/lib/jvm/java-11-openjdk-amd64/bin
#Check java
RUN echo $JAVA_HOME

# boilerpipe
RUN git --version
RUN git config --global http.sslverify false
RUN git clone https://github.com/misja/python-boilerpipe.git
WORKDIR /tmp/python-boilerpipe/
RUN pip3 install -r requirements.txt
RUN python3 setup.py install



RUN pip3 install -r /tmp/requirements.txt


# copy over our app code
WORKDIR /app
COPY ./app /app


EXPOSE 80/tcp
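Since the Dockerfile sets JAVA_HOME by hand, it may be worth confirming what the Python process inside the container actually sees. The following is a minimal sketch using only the standard library; `describe_java_env` and `java_version` are hypothetical helper names, and whether the Java setup has anything to do with the hang is an assumption, not something the report confirms.

```python
import os
import shutil
import subprocess


def describe_java_env(env=None):
    """Report the Java-related settings visible to this process."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    return {
        "JAVA_HOME": java_home,
        # True only if JAVA_HOME is set and points at an existing directory.
        "java_home_is_dir": bool(java_home) and os.path.isdir(java_home),
        # Full path of the `java` binary found on PATH, or None.
        "java_on_path": shutil.which("java", path=env.get("PATH")),
    }


def java_version(java_bin):
    """Return the output of `java -version` (the JVM prints it on stderr)."""
    result = subprocess.run(
        [java_bin, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        universal_newlines=True,   # text mode; works on Python 3.6
        timeout=30,
    )
    return result.stdout
```

Running `print(describe_java_env())` inside the container (for example via `docker exec`) and, if `java_on_path` is not `None`, `print(java_version(info["java_on_path"]))` would show whether the JVM is reachable from the same environment the Flask app runs in.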

After some debugging, I found that the line causing this is:

self.source = BoilerpipeSAXInput(InputSource(reader)).getTextDocument()

Any idea how to solve this problem?
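One way to get more detail on a call that never returns is Python's standard `faulthandler` module, which can dump the Python stack of every thread after a timeout. The sketch below is only an illustration: `call_with_watchdog` and its `log_file` parameter are hypothetical names, not part of python-boilerpipe, and the dump shows Python frames only, not what the JVM is doing.

```python
import faulthandler
import sys


def call_with_watchdog(fn, *args, timeout_s=30.0, log_file=None, **kwargs):
    """Call fn(*args, **kwargs); if it is still running after timeout_s
    seconds, dump all Python thread stacks to log_file (default: stderr).

    The call itself is not interrupted; the dump only shows where it is stuck.
    """
    target = sys.stderr if log_file is None else log_file
    # The watchdog runs in a C-level thread, so it can fire even while the
    # main thread is blocked inside native code.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)
    try:
        return fn(*args, **kwargs)
    finally:
        # Disarm the watchdog once the call returns or raises.
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the `Extractor(...)` construction in `call_with_watchdog(lambda: Extractor(extractor='ArticleExtractor', url=link), timeout_s=30)` would print the stuck Python frame to the container log. For the Java side, the JDK's `jstack <pid>` tool can print the JVM thread stacks of the running process.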

tuxdna (Collaborator) commented Aug 14, 2020

What error do you observe? Were you able to solve this issue? Do you have a small reproducible setup for investigating it?
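A small reproduction could take the network out of the picture by passing an inline HTML string through the `html` keyword instead of `url`, so that only the JVM-backed parsing step is exercised. This is a sketch under assumptions: `extract_text`, `SAMPLE_HTML`, and the `extractor_cls` parameter are hypothetical names introduced here, and the sample page is made up for illustration.

```python
import time

# Tiny inline page, so no network access is needed inside the container.
SAMPLE_HTML = (
    "<html><head><title>Sample</title></head><body>"
    "<h1>Sample article</h1><p>"
    + "This is a sentence of sample article text. " * 40
    + "</p></body></html>"
)


def extract_text(html, extractor_cls=None):
    """Run the extractor on an HTML string and return (text, seconds taken)."""
    if extractor_cls is None:
        # Imported lazily because importing boilerpipe starts the JVM via JPype.
        from boilerpipe.extract import Extractor
        extractor_cls = Extractor
    start = time.monotonic()
    extractor = extractor_cls(extractor="ArticleExtractor", html=html)
    text = extractor.getText()
    return text, time.monotonic() - start
```

Calling `extract_text(SAMPLE_HTML)` once on the host and once inside the container would show whether the hang also occurs without any URL fetch. If it returns in the container, the problem is more likely in the download step than in the parser.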
