Amazon's AWS is 'Retiring' Its Open-Source-and-on-GitHub Documentation
Long-time Slashdot reader theodp writes: On the AWS News Blog, AWS Chief Evangelist Jeff Barr has published a kind of obituary for AWS Documentation on GitHub (RIP, 2018-2023). From the blog post:
"About five years ago I announced that AWS Documentation is Now Open Source and on GitHub. After a prolonged period of experimentation we will archive most of the repos starting the week of June 5th, and will devote all of our resources to directly improving the AWS documentation and website."
"The primary source for most of the AWS documentation is on internal systems that we had to manually sync with the GitHub repos. Despite the best efforts of our documentation team, keeping the public repos in sync with our internal ones has proven to be very difficult and time consuming, with several manual steps and some parallel editing. With 262 separate repos and thousands of feature launches every year, the overhead was very high and actually consumed precious time that could have been put to use in ways that more directly improved the quality of the documentation."
"Our intent was to increase value to our customers through openness and collaboration, but we learned through customer feedback that this wasn't necessarily the case. After carefully considering many options we decided to retire the repos and to invest all of our resources in making the content better."
GitHub doesn't have rsync? (Score:1)
I don't use GitHub, but I'd be surprised if it doesn't have some kind of functionality similar to rsync available.
New version of chapter 3 is ready? Hit the (r)sync button. Done.
Re: (Score:2)
Then it's not really git anymore.
The problems as I see it come from making the internal systems primary at all. The internal repositories should be forks of the public one, otherwise you're pissing upwind. Microsoft is a competitor though, so it wasn't really an option.
Amazon could have just set up their own public gitlab as the primary source though.
Re: (Score:3)
"The problems as I see it come from making the internal systems primary at all."
Exactly. While a nice notion, this was always doomed to fail because they retained the internal documentation systems when they did it. They should have internally worked off the public documentation. They'd still want the ability to keep internal notes, but there are lots of ways to do that. For instance, they could have just used an app for loading and displaying the github information that had a few extra abilities to maintain m
Whaaa? (Score:4, Insightful)
"The primary source for most of the AWS documentation is on internal systems that we had to manually sync with the GitHub repos."
This sounds so blindingly obvious to me that I must be missing something here. Why isn't their documentation straight text, with a higher level for desired language and formatting? Come on, it's AWS. You can't tell me that multi-language programming is a foreign (pardon the pun) concept.
Coding docs not blog posts (Score:1)
Why isn't their documentation straight text, with a higher level for desired language and formatting?
It's really not practical, because you often end up having to include images in some way, plus special formatting indicators for code, tables, or other things, and lots and lots of cross-reference links.
Basically, it's stuff that Markdown is not nearly sufficient for, so you end up needing specialized formatting.
Re: (Score:2)
Thousands and thousands of books have been published in the last few decades using either LaTeX or DocBook. Images and very specific typesetting styles are quite possible using a straight text format that Git and diff can easily consume.
Having worked at Amazon, I suspect the answer is some teams didn't buy into the concept and forced silly compromises. And those compromises were not properly staffed or were not scalable (gasp!) so it didn't work. Lots of corporate mental illness dooms any attempts at wiping out techni
Re: (Score:2)
A good way to wipe out technical debt is to remove redundancies. Makes sense to me.
Besides, they fundamentally aren't publishing "books", and probably shouldn't. The book is a holdover from days when things didn't move so quickly. Also, it's a terrible idea because people will save that book offline and expect it to be accurate, when it's almost certainly out of date a couple of days after release.
Re: (Score:2)
I publish slide decks on the web all done in LaTeX. You don't necessarily have to make a book when you use these tools. They're actually macro systems and you can work out a template for just about any kind of publication where you want text and images that have some relative layout on a screen or sheet of paper.
Re: (Score:2)
This sounds so blindingly obvious to me that I must be missing something here. Why isn't their documentation straight text, with a higher level for desired language and formatting?
I've worked on documentation for languages (C#, VB, Hack). If documentation is straight text you're probably doing it wrong. You want a build pipeline so that every code sample in the docs gets automatically built as part of your continuous integration. You want an index.
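A rough sketch of what "build every code sample in CI" could look like. This is only an illustration, not the pipeline the poster actually used: it uses Python as a stand-in language, assumes the samples live in fenced blocks tagged python, and only checks that each sample compiles. The names TICKS, FENCE, check_samples and the sample document are all made up for the example.

```python
import re

# Build the fence marker programmatically so this sample can itself sit in a fenced block.
TICKS = "`" * 3
# Hypothetical convention: samples are fenced blocks tagged "python".
FENCE = re.compile(re.escape(TICKS) + r"python\n(.*?)" + re.escape(TICKS), re.DOTALL)


def check_samples(doc_text):
    """Return (sample_number, error_message) for every sample that fails to compile."""
    failures = []
    for n, match in enumerate(FENCE.finditer(doc_text), start=1):
        try:
            # compile() only parses the sample; it does not run it.
            compile(match.group(1), f"<sample {n}>", "exec")
        except SyntaxError as exc:
            failures.append((n, exc.msg))
    return failures


# Made-up document: the first sample is fine, the second has an unclosed parenthesis.
doc = (
    "Intro text\n"
    f"{TICKS}python\nprint('ok')\n{TICKS}\n"
    "More text\n"
    f"{TICKS}python\nprint('broken'\n{TICKS}\n"
)

failures = check_samples(doc)
failed_numbers = [n for n, _ in failures]
print(failed_numbers)
```

A CI job would fail the build whenever the returned list is non-empty. A real pipeline would go further and actually run or test the samples.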
Re: (Score:1)
This is what LaTeX was made to do.
You're doing it wrong. (Score:5, Insightful)
Re: (Score:3)
Agreed, even if a lot of Azure documentation is outdated or looks like stuff autogenerated from code.
That doesn't shock me (Score:5, Informative)
We recently used the AWS GitHub repo to find a fix to a bug in an AWS Log4J patch that AWS support wouldn't fess up to.
Rather than admit that their code was leaving a bunch of garbage temp files behind, they just quietly fixed the bug without notifying anyone and then tried to blame us for using the older version. Grr.
Horseshit (Score:2)
"The primary source for most of the AWS documentation is on internal systems that we had to manually sync with the GitHub repos."
The only thing this tells me is that the entire AWS team is too incompetent to figure out how to make cron run a script every night.
If that is their actual excuse as a business then some of you might want to be talking to your managers about moving away from AWS because they clearly cannot be trusted for the most basic of tasks, much less ensuring server security.
Re: (Score:3)
Re: (Score:2)
Isn't that what git is good at?
Re: (Score:2)
Re:Horseshit (Score:4, Insightful)
Merges are more complicated than git (or any source code control system) can do automatically.
Let's say user A modifies line 123 of a file on one system. User B modifies the same line, but with different text. There is no way to automatically decide which one should be kept (and no, modification date isn't a good criterion - it is possible that one is better than the other).
Even a merge that "succeeds" without conflict might still result in incorrect documentation. User A modifies line 123, user B modifies line 124 in a different way - for simplicity let's say each added the equivalent of "not" to the text - now you have two "nots" when you really should have one.
So, bottom line: you need human intervention to validate merges (this can be true of code as well... blindly merging can cause bugs of the same sort). I can easily see that doing this at any large scale could rapidly be a resource sink. You need one master system, not two, regardless of the source code control used. Amazon chose their internal system rather than the public git for that master (I wouldn't presume to judge if this is a "good" decision or not).
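To make the two cases concrete, here is a toy line-by-line three-way merge. It is a deliberately simplified sketch, not git's actual merge algorithm: it assumes all three versions have the same number of lines, and merge3 and the sample sentences are invented for the illustration.

```python
def merge3(base, ours, theirs):
    """Naive three-way merge of equal-length line lists.

    Returns (merged_lines, conflict_indexes). A conflicted line is None.
    """
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed the line
            merged.append(t)
        elif t == b:      # only "ours" changed the line
            merged.append(o)
        else:             # both changed the same line differently
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["The cache is", "flushed on restart."]

# Case 1: both users edit the same line, so the tool cannot pick a winner.
same_line = merge3(base,
                   ["The cache is not", "flushed on restart."],
                   ["The cache is never", "flushed on restart."])

# Case 2: the users edit adjacent lines, so the merge is "clean" but reads wrong.
adjacent = merge3(base,
                  ["The cache is not", "flushed on restart."],
                  ["The cache is", "never flushed on restart."])

print(same_line)
print(adjacent)
```

In the second case the tool reports no conflict, yet the merged text reads "The cache is not / never flushed on restart." Only a human reviewer would catch that.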
Re: (Score:2)
maintain two parallel version control systems
Synchronizing is exactly what git is designed for.
There's gotta be a better way! (Score:2)
If only there were some tool they could use to push their changes to Github.