Ethical, Legal, and Security Considerations
samyuc edited this page Nov 4, 2019 · 1 revision
- Passwords and user info
- https://docs.djangoproject.com/en/2.2/topics/security/
- Django offers many options for securing software, which matters for our project because we need to protect users’ usernames, passwords, and potentially personal or private texts. (See security section for further detail.)
- Refer to users by their IDs in our databases and systems so that we are not prying into who is uploading what information
- Uploaded text
- The text that users upload could be sensitive or personal, so we should also take measures to secure the database the text is stored in
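One way to picture the "refer to users by their IDs" idea is sketched below. It is only an illustration, not the project's actual schema: the `Account` and `Upload` classes and the `uploads_for` helper are hypothetical names, and a real Django project would use models and foreign keys instead of plain dataclasses.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Account:
    """Identity data, kept apart from the uploaded text (hypothetical)."""
    email: str
    # Opaque random identifier; reveals nothing about the person.
    user_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Upload:
    """An uploaded text, linked to its owner only by the opaque ID."""
    user_id: str
    text: str


def uploads_for(user_id: str, uploads: List[Upload]) -> List[Upload]:
    """Return the uploads that belong to one user ID."""
    return [u for u in uploads if u.user_id == user_id]
```

With this split, code that processes uploads never needs to touch the email address or any other identifying field.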
Could the use of your software result in racial, gender, religious, or any other type of discrimination? How does your software try to mitigate this problem?
Ways in which this could happen:
- An uploaded text contains an actual discriminatory bias, and the algorithm simply confirms that perspective as an important aspect of the document. This is a valid use of our software, as it could lead to a better understanding of such a text, for example when studying hate speech in an academic setting.
- The algorithm favors and seems to (incorrectly) portray a discriminatory bias.
- Users with a racial, gender, religious, or other discriminatory bias deliberately misinterpret the results of the software to feed their confirmation bias.
- We, as software developers, do not need to monitor this issue too closely, as users' perspectives are not our responsibility. However, it becomes a problem if such a perspective is shared with the public as “scientific truth” because our algorithms are presented as entirely objective rather than as statistical methods with a limited confidence of success.
- Solution:
- Disclaimer on the effectiveness of our algorithm(s) with explanation somewhere in decent detail on the site and something along the lines of “User assumes all responsibility”
- This solution also covers the user assuming the risk of confirmation bias when a biased set of texts is used, which could be explained in a couple of sentences wherever the above-mentioned disclaimer appears.
- Bad data, bad results. If people use data that is not representative of a population (e.g., text samples from only white men) and then extrapolate the results to the entire population, this could harm people from minority communities.
- Solution:
- Education and resources on how much the algorithms can actually tell us
- Disclaimer that results of the algorithms are 1) not representative of the views of the developers 2) dependent on what text was uploaded
Can your software be abused by some users to cause harm to other users? Or to the public at large? How do you mitigate it?
- If we (as administrators of the website) see that someone is uploading text that violates our terms of use (which will disallow any illegal activity) then we will ban the user
- The solution above addresses the primary issue: flawed conclusions drawn from algorithms that appear flawless and entirely objective, when in fact they have an inherent rate of success and failure.
- We must explain this issue in detail and keep the project open source, so that it is fully transparent and open to critique as the algorithms and techniques are updated.
- We provide disclaimers explaining the methodology used
- We make our code open-source so that people can see exactly where the results are coming from
- We provide access to resources so that people know how to properly interpret results
- We make it clear that we are only providing a service based on preexisting mathematical algorithms and not providing any sort of opinion on the text provided
Is your software violating any licensing agreements? List all third-party software you plan to use and ensure that you have the right to use as you plan.
- Django
- No licensing agreements will be violated. We are not using it for commercial use. Django is free and open source. We will not be redistributing source code in any way, and we will not use the name of Django to endorse or promote our product. Thus, the conditions stated by Django (at https://www.djangoproject.com/ and https://github.com/django/django/blob/master/LICENSE) are met.
- Angular
- No licensing agreements will be violated. We are not using it for commercial use. We will not be redistributing any source code. We therefore meet all requirements to use Angular.
- NLTK
- “You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications,” so long as recipients of the work are provided with a license
- GenSim
- “This means that it’s free for both personal and commercial use, but if you make any modification to Gensim that you distribute to other people, you have to disclose the source code of these modifications.”
- SciPy
- Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- TextBlob
- The license states that permission is granted to “use, copy, modify, merge, publish...” so long as the license is included in any copies of the software
Are there any intellectual property constraints placed by your client? or by the owner of some dataset you need to use? List them.
No, there are no intellectual property constraints. We will include a disclaimer in our terms and conditions stating that we are not liable if any of a user's uploaded information is stolen, releasing us from any legal responsibility.
Can your users use your app to break the law? Post copyrighted works on your webapp? Steal information? Etc.
- Users could technically store copyrighted material on our webapp. If a user is found to be doing this, their account will be removed. Users cannot post anything that other users can see, nor can they interact with other users in a way that would let them steal information. Unless our system is hacked, users cannot steal other users' information, and by using our website, users accept that risk.
We will be keeping users' emails, passwords, and uploaded text. As stated in the ethical section, we will work to protect this data in several ways: encouraging users to choose strong passwords, storing passwords in the database only in hashed form, and taking measures to secure the database.
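To make "keeping their passwords in the database hashed" concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the project's implementation: Django's built-in authentication system already hashes passwords on its own, and the function names, the `ITERATIONS` value, and the storage format below are all hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, chosen only for illustration


def hash_password(password: str) -> str:
    """Return 'iterations$salt$digest' for storage; the password is not kept."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"{ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the digest from the stored salt and compare in constant time."""
    iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)
```

Because each password gets its own salt, two users with the same password end up with different stored values, and a leaked database does not directly reveal anyone's password.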
Identify possible attack vectors, that is, ways malicious users could try to use your software to escalate their privileges. This includes root access to your server, access to other users’ sensitive information (say via XSS attacks), root access to your database, etc. Explain your protection plan.
We will be doing several things to prevent privilege escalation including:
- Keep critical information on the server side. We will only send session IDs to the user.
- Use a digital signature where necessary so that any tampering with the data can be detected
- Encrypt the data sent to the user
- Use OWASP methods to prevent things like XSS attacks (https://github.com/OWASP/CheatSheetSeries/tree/master/cheatsheets)
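Two of the measures above, signing data so tampering is detectable and escaping user text to blunt XSS, can be sketched with the Python standard library. This is a rough illustration under assumptions, not the project's code: `SECRET_KEY`, `sign`, `unsign`, and `render_upload` are hypothetical names, and in a Django project the `django.core.signing` module and template auto-escaping would normally cover the same ground.

```python
import hashlib
import hmac
import html

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; never hard-code in practice


def _mac(value: str) -> str:
    """Keyed SHA-256 MAC of the value, as a hex string."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value: str) -> str:
    """Append a MAC so that later changes to the value can be detected."""
    return f"{value}:{_mac(value)}"


def unsign(signed: str) -> str:
    """Return the original value, or raise ValueError if it was altered."""
    value, _, mac = signed.rpartition(":")
    if not hmac.compare_digest(mac.encode("utf-8"), _mac(value).encode("utf-8")):
        raise ValueError("signature mismatch")
    return value


def render_upload(text: str) -> str:
    """Escape uploaded text before placing it in HTML."""
    return "<p>" + html.escape(text) + "</p>"
```

Strictly speaking, an HMAC is a message authentication code rather than a public-key digital signature, but it serves the same purpose when only the server needs to verify the data it issued.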