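Protecting Java source code from being accessed

Last week, I had to create a small GUI for homework. None of my classmates did theirs. They stole mine from the place where we upload our work, and then uploaded it again as their own. When I told my teacher it was all my work, he did not believe me.

So I thought of putting in a useless method or something similar to prove that I coded it. I thought of encryption. My best idea so far is: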

String key = ("ZGV2ZWxvcGVkIGJ5IFdhckdvZE5U"); //My proof in base64

Can you think of any better way?


If you are giving source code to the teacher, then simply add a serialVersionUID to one of your classes that is an encrypted version of your name. You can decrypt it for the teacher yourself.

It means nothing to the others, only to you. You can say it is generated code; if they are stealing it, they probably won't bother to modify it at all.
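A minimal sketch of the idea, assuming a name of at most 8 ASCII characters. Note that this is plain byte packing, not real encryption, and `NameTag`, `encode` and `decode` are hypothetical names made up for illustration. The sample name is the handle from the question's Base64 string.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: packs a short ASCII name into a long and back.
class NameTag {
    // Pack up to 8 ASCII characters into a long, left-aligned and zero-padded.
    static long encode(String name) {
        byte[] raw = name.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > 8) {
            throw new IllegalArgumentException("name must fit in 8 bytes");
        }
        byte[] padded = new byte[8];
        System.arraycopy(raw, 0, padded, 0, raw.length);
        return ByteBuffer.wrap(padded).getLong();
    }

    // Recover the name from the long, dropping the zero padding.
    static String decode(long tag) {
        byte[] bytes = ByteBuffer.allocate(8).putLong(tag).array();
        int len = 8;
        while (len > 0 && bytes[len - 1] == 0) {
            len--;
        }
        return new String(bytes, 0, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        long tag = encode("WarGodNT");
        // Paste the printed line into one of your classes.
        System.out.println("private static final long serialVersionUID = " + tag + "L;");
        // In front of the teacher, turn the number back into the name.
        System.out.println(decode(tag));
    }
}
```

The number looks like any other generated serialVersionUID, which is the point.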

If you want to do it in a stylish way, you could use this trick: find the random seed that produces your name. :) That seed would then be your number, and wherever it appears it would prove that you were the one who wrote the code.
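Here is a rough sketch of how such a seed could be found by brute force, assuming the name is spelled with lowercase letters a-z and each letter is drawn with `nextInt(27)` (0 ends the word). `SeedName`, `fromSeed` and `findSeed` are made-up names. The search cost grows roughly 27-fold per extra letter, so this is only practical for initials or a very short name.

```java
import java.util.Random;

// Hypothetical helper: maps a seed to a lowercase word and searches for a seed by brute force.
class SeedName {
    // Draw numbers 0..26 from the seeded generator; 1..26 map to a..z, 0 ends the word.
    static String fromSeed(long seed) {
        Random rnd = new Random(seed);
        StringBuilder sb = new StringBuilder();
        for (int k = rnd.nextInt(27); k != 0; k = rnd.nextInt(27)) {
            sb.append((char) ('a' + k - 1));
        }
        return sb.toString();
    }

    // True if the seed produces exactly the target word followed by the terminating 0.
    static boolean matches(long seed, String target) {
        Random rnd = new Random(seed);
        for (int i = 0; i < target.length(); i++) {
            if (rnd.nextInt(27) != target.charAt(i) - 'a' + 1) {
                return false;
            }
        }
        return rnd.nextInt(27) == 0;
    }

    // Try seeds 0, 1, 2, ... until one matches. Only sensible for short lowercase targets.
    static long findSeed(String target) {
        for (long seed = 0; ; seed++) {
            if (matches(seed, target)) {
                return seed;
            }
        }
    }

    public static void main(String[] args) {
        long seed = findSeed("wg"); // example initials
        System.out.println(seed + " -> " + fromSeed(seed));
    }
}
```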

If your classmates stole your code from the upload site, I would encrypt your homework and email the key to the teacher. You can do this with PGP if you want to be complicated, or something as simple as a Zip file with a password.

EDIT: PGP would allow you to encrypt/sign without revealing your key, but you can't beat the sheer simplicity of a Zip file with a password, so just pick a new key for every homework assignment. Beauty in simplicity :)
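PGP itself needs external tooling, but the "sign without revealing your key" part can be illustrated with the JDK's own `java.security` classes. This is a stand-in sketch using a plain RSA signature, not PGP; `SignHomework` and its methods are hypothetical names, and the idea assumes the teacher received your public key before the deadline.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper: signs homework bytes with a private key, verifies with the public key.
class SignHomework {
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the holder of the private key can produce this signature.
    static byte[] sign(byte[] data, PrivateKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(data);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone holding the public key can check the signature.
    static boolean verify(byte[] data, byte[] signature, PublicKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(key);
            s.update(data);
            return s.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        byte[] homework = "class Gui { /* ... */ }".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(homework, pair.getPrivate());
        System.out.println("signature valid: " + verify(homework, signature, pair.getPublic()));
    }
}
```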

What was stolen?

  • The source? You can put random Strings in it (but they can be changed). You can also add special behavior known only to you (for example, a special keypress that changes the color of a row), and then ask the teacher, "Do the others know this special combo?" The best way would be to crash the program after 5 minutes of activity if an empty, otherwise useless file is not present in the archive; your schoolmates will be too lazy to wait that long.

  • The binary? Just comparing the checksum of each .class file will be enough (your schoolmates are too lazy to rewrite the class files).
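For the checksum comparison, something along these lines should do. It is only a sketch: `Checksum` and `sha256Hex` are made-up names, and SHA-256 is my choice of digest, not something the comparison depends on.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: prints a SHA-256 checksum for each file given on the command line.
class Checksum {
    // Hex-encoded SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Checksum Mine.class Theirs.class
        for (String name : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(name));
            System.out.println(sha256Hex(bytes) + "  " + name);
        }
    }
}
```

Identical checksums for two students' .class files mean the files are byte-for-byte the same.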

Use a distributed (=standalone) version control system, like git. Might be useful too.

A version history with your name and dates might be sufficiently convincing.

This happened with a pair of my students who lived in the same apartment. One stole the source code from a disk left in a desk drawer.

The thief slightly modified the stolen source, so that it wouldn't be obvious. I noticed the similarity of the code anyway, and examined the source in an editor. Some of the lines had extra spaces at the ends. Each student's source had the same number of extra spaces.

You could exploit this to encode information without making it visible. You could encode your initials or your student ID at the ends of some lines, with spaces.

A thief will likely make cosmetic changes to the visible code, but may miss the non-visible characters.

EDIT:

Thinking about this a little more, you could use spaces and tabs as Morse-code dits and dahs, and put your name at the end of multiple lines. A thief could remove, reorder or retype some lines without destroying your identification.
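A simplified sketch of this, using a plain 8-bit binary code per character (space = 0, tab = 1) rather than actual Morse code. `TrailingWhitespace` and its methods are hypothetical names, and it assumes the visible part of the line ends in a non-whitespace character and the tag is ASCII.

```java
// Hypothetical helper: hides a short ASCII tag in spaces and tabs at the end of a line.
class TrailingWhitespace {
    // Each character becomes 8 whitespace symbols, most significant bit first.
    static String toWhitespace(String tag) {
        StringBuilder sb = new StringBuilder();
        for (char c : tag.toCharArray()) {
            for (int bit = 7; bit >= 0; bit--) {
                sb.append(((c >> bit) & 1) == 1 ? '\t' : ' ');
            }
        }
        return sb.toString();
    }

    // Append the invisible tag to one source line.
    static String mark(String line, String tag) {
        return line + toWhitespace(tag);
    }

    // Decode the trailing run of spaces and tabs back into text, 8 symbols per character.
    static String extract(String line) {
        int end = line.length();
        int start = end;
        while (start > 0 && (line.charAt(start - 1) == ' ' || line.charAt(start - 1) == '\t')) {
            start--;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = start; i + 8 <= end; i += 8) {
            int c = 0;
            for (int j = 0; j < 8; j++) {
                c = (c << 1) | (line.charAt(i + j) == '\t' ? 1 : 0);
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String marked = mark("int x = 0;", "WG"); // example initials
        System.out.println("[" + marked + "]");
        System.out.println(extract(marked));
    }
}
```

Keep in mind that editors or formatters configured to strip trailing whitespace will erase the mark.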

EDIT 2:

"Whitespace steganography" is the term for concealing messages in whitespace. Googling it reveals this open-source implementation dating back to the '90s, using Huffman encoding instead of Morse code.

I had the same problem as you a long time ago. We had Windows 2000 machines and uploaded files to a Novell network folder that everyone could see. I used several tricks to beat even the best thieves: whitespace watermarking; metadata watermarking; unusual characters; trusted timestamping; modus operandi. Here they are, in order.

Whitespace watermarking

This is my original contribution to watermarking. I needed an invisible watermark that worked in text files. The trick I came up with was to put a specific pattern of whitespace between programming statements (or paragraphs). The file looked the same to them: some programming statements and line breaks. Selecting the text carefully would show the whitespace. Each empty line would contain a certain number of spaces (e.g. 17) that's obviously not random or accidental. In practice, this method did the work for me because they couldn't figure out what I was embedding in the documents.
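Roughly, in Java, it could look like the sketch below. `BlankLineMark`, `mark` and `countMarked` are names I made up for illustration, and the sketch assumes `\n` line endings.

```java
import java.util.Arrays;

// Hypothetical helper: fills empty lines with a fixed number of spaces and counts them later.
class BlankLineMark {
    private static String spaces(int n) {
        char[] pad = new char[n];
        Arrays.fill(pad, ' ');
        return new String(pad);
    }

    // Replace every empty line with n spaces (this includes the empty piece after a final newline).
    static String mark(String source, int n) {
        String pad = spaces(n);
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                lines[i] = pad;
            }
        }
        return String.join("\n", lines);
    }

    // Count the lines that consist of exactly n spaces and nothing else.
    static int countMarked(String source, int n) {
        String pad = spaces(n);
        int count = 0;
        for (String line : source.split("\n", -1)) {
            if (line.equals(pad)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String source = "int a = 1;\n\nint b = 2;\n\nint c = 3;";
        String marked = mark(source, 17);
        System.out.println(countMarked(marked, 17) + " marked blank lines");
    }
}
```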

Metadata watermarking

This is where you change the file's metadata to contain information. You can embed your name, a hash, etc. in unseen parts of a file, especially EXEs. In NT days, Alternate Data Streams were popular.

Unusual characters

I'll throw this one in just for kicks. An old IRC impersonation trick was to make a name with letters that look similar to another person's name. You can use this in watermarking. The Character Map in Windows will give you many unusual characters that look similar to, but aren't, a letter or number you might use in your source code. These showing up in a specific spot in someone else's work can't be accidental.
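To show the instructor where such characters sit, a small scanner like this sketch could list every non-ASCII character in a source file. `HomoglyphScan` and `findNonAscii` are hypothetical names, and the example identifier uses a Cyrillic "а" (U+0430) in place of a Latin "a", which Java accepts in identifiers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists non-ASCII characters in source text with their positions.
class HomoglyphScan {
    // Report every code point above 127 with its 1-based line, column and code point.
    static List<String> findNonAscii(String source) {
        List<String> hits = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int ln = 0; ln < lines.length; ln++) {
            String line = lines[ln];
            for (int col = 0; col < line.length(); ) {
                int cp = line.codePointAt(col);
                if (cp > 127) {
                    hits.add(String.format("line %d, col %d: U+%04X", ln + 1, col + 1, cp));
                }
                col += Character.charCount(cp);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The second "a" in "c\u0430t" is the Cyrillic look-alike.
        String source = "int a = 1;\nint c\u0430t = 2;";
        for (String hit : findNonAscii(source)) {
            System.out.println(hit);
        }
    }
}
```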

Trusted Timestamping

In a nutshell, you send a file (or its hash) to a third party who then appends a timestamp to it and signs it with a private key. Anyone wanting proof of when you created a document can go to the trusted third party, often a website, to verify your proof of creation time. These have been used in court cases for intellectual property disputes so they are a very strong form of evidence. They're the standard way to accomplish the proof you're seeking. (I included the others first b/c they're easy, they're more fun and will probably work.)

This Wikipedia article might help your instructor understand your evidence and the external links section has many providers, including free ones. I'd run test files through free ones for a few days before using them for something important.

Modus operandi

So, you did something and you now have proof right? No, the students can still say you stole the idea from them or some other nonsense. My fix for this was to, in private, establish one or more of my methods with my instructor. I tell the instructor to look for the whitespace, look for certain symbols, etc. but to never tell the others what the watermark was. If the instructor will agree to keep your simple techniques secret, they will probably continue to work fine. If not, there's always trusted timestamping. ;)

It seems like an IT administration problem to me. Each student should have their own upload area which cannot be accessed by other students.

The teacher would be one level higher, able to access each student's upload folder. If this is not possible, go with @exabrial's answer, as that is the simplest solution.

The best thing you can do is to just zip the source code with a password and e-mail the password to the teacher.

Problem solved.

In my case, my teachers came up with a better approach. The questions they set had something to do with our registration numbers. For example:

The input to a function/theory is our registration number, which is different for each student.

So the answers, or the approach to the solution, differ from student to student. This means every student has to do their homework on their own, or at least work out how to adapt someone else's approach to their own registration number [which may be harder than learning the lesson ;)].

Hope your lecturer will read this thread before his next tutorial :D

Just post your solution at the last minute. That leaves no one time to copy it.

And send feedback to the administrator asking them to stop students from seeing other students' assignments.

If you upload the file in a password-protected .zip, anyone can download the .zip file and have their CPU run a million password guesses against it, if they are that determined a cheat. Unfortunately, some are, and it's easy to do.

Your source can be viewed on the shared server by the other students. The teacher should really be giving you your own password-protected directory to upload to. This could be done easily just by adding subdomains. But perhaps the teacher might allow you to upload the files to your own server and access them there.

It's also possible to obfuscate the script so that it contains a document.write('This page was written by xxxxx'), so that anyone who copies your work cannot remove the credit unless they first decrypt it. But the real answer is that your school needs to give each of its students their own password-protected directory.