Vocaloid

VOCALOID (often referred to as just V1 or VOCALOID1 in VOCALOID communities) is a singing synthesizer application software developed by the YAMAHA Corporation. The project was an international effort, and is considered the brainchild of Kenmochi Hideki, also known as the "father" of VOCALOID.

Elvis Project
In the 20th century, the most successful attempt at vocal synthesis had been a rendition of the "Queen of the Night" aria from Mozart's opera The Magic Flute, made in 1984 by Yves Potard and Xavier Rodet using the CHANT synthesizer.

Jordi Bonada, a senior researcher at the Music Technology Group at Pompeu Fabra University in Barcelona, joined the university in 1997. There he worked on a research project requested by YAMAHA that contained some "interesting" ideas. Rather than recording just one song from a singer, Bonada recorded a variety of ranges and pitches in an attempt to build a model from which any song could be constructed. The project was codenamed "Elvis" and lasted two years, but it did not become a product: it relied on spectral morphing techniques, which made it too large, and each song required a professional singer behind it.[1]

Daisy Project
Although it did not become a product, the "Elvis Project" helped establish that a series of phonetic recordings across a wide range of pitches could be used to build a synthesizer based on any model. YAMAHA agreed to help the researchers start a new project, and it was at this point that Kenmochi Hideki joined.[1] The initial ideas came from him in Japan in 2000, with most of the research carried out at Pompeu Fabra University and the core signal processing libraries written in C++. YAMAHA itself was responsible for product design and development of the actual product. It was purely collaborative research, and the team was not thinking about sales at that time.

At the time, synthesizers took days to produce good-quality results, and even then the vocal sounded inhuman and obviously machine-generated. They were also expensive. While every other part of music production could by then be recreated in a DAW, a good-quality vocal performance still meant hiring a human vocalist. The aim of the project was therefore to provide a fast, low-cost way of producing uncannily human-like vocals, giving producers full control of music production.[2] The team used "Elvis" as the base model for ideas and set about tackling two main problems:

How to process and transform the singer's recordings so that the resulting performance of a given song would sound as natural as possible and provide the feeling of a continuous flow.

The VOCALOID™ project was originally codenamed "Daisy Project" ("DAISYプロジェクト" or "でいじぃぷろじぇくと"), a name taken from the song "Daisy Bell", and was at a prototype stage in March 2002. EpR [1] was developed as the first voice model; it allowed the researchers to transform vocal timbres in a natural manner while preserving subtle detail.[1] At first, "Daisy" could only pronounce vowels, as in "ai" (love). Four months later, "Daisy" began to support consonants, with the first "complete word" being "asa" (morning).

Because YAMAHA itself could provide only a limited number of vocals, it licensed the software to various third-party studios.[1] The first studio to join the project was Crypton Future Media, which was contacted in May 2002. YAMAHA then looked for studios to support an English version, but most responses were negative. The first studio to enter development was Zero-G, which joined in the fall of 2002, with PowerFX also joining that year. Thus, both English and Japanese voicebanks began development.

At the 6th anniversary of VOCALOID™, Hiroyuki Itoh noted that they had received demos from Zero-G, without warning, of what seemed to be a male vocal singing. Since the demos came unexpectedly, they did not realize these were VOCALOID™ demos and thought they were some sort of prank.

"Daisy" was demonstrated at the 6th anniversary of VOCALOID™, where a file called "Fly Me to the Moon" was played. The file was originally created for July 16, 2002, when Crypton was shown the first demonstration in its Sapporo office. "Daisy" still had trouble with consonants at the time.

The name "Daisy" was dropped due to copyright conflicts; despite attempts to alter the name (such as translating it into Japanese), it ultimately could not be registered.

The only four known vocals for "Daisy" were LEON, LOLA, HANAKO, and TARO. LEON and LOLA were the only ones ever shown to the public, and were released as official voicebanks for the final VOCALOID software.

VOCALOID
Three examples of vocal library plug-in boxart are seen; this was YAMAHA's planned direction for the software

Kenmochi reported that choosing a name for the software was very difficult, and that "VOCALOID" had only been the third choice. It was selected two or three weeks before the announcement, after the second-choice name fell through because of a copyright conflict with a piece of software in Belgium. "VOCALOID" is a portmanteau of the words "Vocal" and "Android" ("vocal android"). Kenmochi chose to announce the technology on February 26, 2003, a day before his birthday.

VOCALOID™ was originally designed to act as a replacement for a real singer. Many reviewers at the time of LEON and LOLA's release considered "VOCALOID" a bold effort, as human speech is complex to recreate. VOCALOID was regarded as the first software of its kind to tackle singing vocals.

KAITO and MEIKO were originally recorded by YAMAHA themselves, before being made for commercial release. KAITO ended up being delayed a year and a half.

The first VOCALOIDs, LEON and LOLA, made their debut and initial release at the NAMM Show on January 15, 2004. Zero-G then released them in Japan on March 3, 2004; both were sold as a "Virtual Soul Vocalist". They were also demonstrated at the Zero-G Limited booth during Wired Nextfest and won the 2005 Electronic Musician Editor's Choice Award. Zero-G later released MIRIAM, with her voice provided by Miriam Stockley, in July 2004. Later that year, Crypton Future Media, Inc. handled the release of the first Japanese VOCALOID, MEIKO. Between the releases of MIRIAM and MEIKO, the first rival software, Cantor, was released to compete with VOCALOID, which at the time was known in the western hemisphere only through LEON, LOLA, and MIRIAM.

Later, the Game Audio Network Guild held the "2nd Annual G.A.N.G. Awards Show" on Thursday, March 25, 2004 at the Fairmont Hotel in San Jose, California, during the Game Developers Conference 2004. The software won the "Best New Audio Technology" award in the Industry & Trade category.

Features
VOCALOID has 5 voicebanks available (3 English, 2 Japanese), offering a limited range of voices. Users can achieve other genres through further voice editing. Both English and Japanese VOCALOID have an English interface. Other languages were planned for the future (though these would not be introduced until VOCALOID3).

According to the original YAMAHA VOCALOID website, the software's key feature was its ability to recreate singing exactly as it was typed out on a PC. Manipulating the vocals allowed for a greater array of styles and voices than the voicebanks themselves offered, while maintaining a degree of realism. VOCALOID based its vocals more on analysis of the human voice than on samples of it. Extra expression could be added to a voice simply by applying vocal effects.

The file format for VOCALOID is "VOCALOID MIDI" (.MIDI); VOCALOID will not import .VSQ or .VSQX files, although it will import most MIDI file types.

The database of VOCALOID is much simpler than that of the VOCALOID2 engine that followed, and consonant sounds are more difficult to modulate. However, VOCALOID has some functions that VOCALOID2 does not, such as the Resonance parameters. Resonance allowed the phonetic data to be manipulated through formant modulation, making it sound different depending on the settings applied. The biggest advantage this offered was flexibility. As seen with voicebanks like LEON or MEIKO, each user can use the voicebanks very differently, and delicate editing with several Resonances or other functions has produced a wide range of results. All VOCALOID vocals are known to have had a small, albeit undeclared, optimum vocal range compared to most vocals powered by later engine versions.

Unlike the version that followed, VOCALOID was an analysis-based system that worked out mathematically how to adapt the vocal. In short, it used recorded sample data to make the engine sound more like the vocalist behind the data; as a result, the overtone of all 5 vocals was identical. The vocals sounded very synthetic and low-quality, yet this is also why the engine offered such great flexibility compared with the sample-based versions that followed. The quality issue limited the feasibility of releasing vocals for it, and Sweet ANN and BIG AL were not released for this version of VOCALOID for this reason. Also, while realism was not beyond it, the analysis-based approach did not produce results as realistic as the sample-based system.

When DSE.dll or DSE1_1.dll is examined with a hex editor, the engine lists a number of phonetics as possible sounds; however, no released VOCALOID used them.
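As a rough illustration of this kind of inspection, the sketch below pulls runs of printable ASCII text out of a binary file, much as one would spot them by eye in a hex editor. It assumes the phonetic symbols are stored as plain ASCII, which has not been confirmed here; the function names extract_ascii_strings and strings_in_file are hypothetical helpers, not part of any VOCALOID tool.

```python
import re
from pathlib import Path


def extract_ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII bytes at least min_len long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def strings_in_file(path: str, min_len: int = 4) -> list[str]:
    """Read a binary file (assumed small enough for memory) and list its text runs."""
    return extract_ascii_strings(Path(path).read_bytes(), min_len)
```

Calling strings_in_file on a local copy of the DLL (the path is whatever the file is called on the user's machine) would return the candidate text runs, which then still need to be checked by hand to see which ones are actually phonetic symbols. Lowering min_len catches short symbols at the cost of more noise.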

The VOCALOID interface also varied slightly depending on which VOCALOID was used to open the engine. For example, MIRIAM's interface coloured the area around the keyboard keys deep blue and carried Zero-G's logos, while KAITO's was green with Crypton Future Media logos. The standard interface used in VOCALOID demos and presentations was brown with no logos whatsoever.

Cultural impact
In comparison to its successor VOCALOID2, VOCALOID had very little cultural impact at its time of release, and sales of the software were very sluggish. It is difficult to know how many songs and albums use the VOCALOID software, since songwriters must ask permission before being allowed to state specifically that they are using a VOCALOID in their songs. The lack of attention also led to a lack of knowledge about, and coverage of, how widespread usage of the software was.

The first album to be released using a VOCALOID was A Place in the Sun, which used LEON's voice for the vocals singing in both Russian and English. MIRIAM has also been featured in two albums, Light + Shade and Continua. Japanese electropop artist Susumu Hirasawa used VOCALOID L♀LA in the original soundtrack of Paprika by Satoshi Kon.

Most songs featuring the software as the main singer did not exist until after 2008, when KAITO was rediscovered. Because it had become popular to feature Hatsune Miku or the Kagamines as the main singer of entire songs, producers began to do the same with the older software. VOCALOID was mostly useful only for creating loops, as seen in "Paprika", since the software was not good enough to be a full replacement singer. Adding to the lack of focus, its limited coverage meant that few techniques were known for making it sound better. In addition, the majority of producers who used the software arrived after 2008. There was also no fan culture during the era: there were "users" of the software but no "fans" to create a "fandom" around either the English or the Japanese version. VOCALOID was treated like any other DTM plug-in or software application, so it went unacknowledged outside DTM and EDM circles until 2008.

The CEO of Crypton Future Media, Inc. noted the lack of interest in the initial VOCALOID software. Many studios approached by Crypton Future Media for recommendations initially had no interest in the software, with one company representative calling it a "toy". Crypton attributed part of the lack of response to the software's sale to a fear of robots. LEON and LOLA were also blamed for the lack of sales in America because of their British accents, despite overall praise from early reviewers and the fact that the English version of the software had sold well in both Japan and Europe.

Earlier VOCALOIDs were created without "avatars", and boxart was not important to the function of the program. While MEIKO and KAITO had images that could later be used as avatars, LEON, LOLA and MIRIAM (although there is a clear image of a person) did not. When avatars became common with Japanese VOCALOIDs during the VOCALOID2 era, the English VOCALOIDs without official avatars were left to interpretation by fan artwork. Zero-G did show interest in revising the boxart of their VOCALOIDs since interest in VOCALOIDs had greatly increased, but the voicebanks were retired before this occurred.

Hatsune Miku
Hatsune Miku (初音ミク), codenamed CV01, was the first Japanese VOCALOID to be both developed and distributed by Crypton Future Media, Inc. She was initially released in August 2007 for the VOCALOID2 engine and was the first member of the Character Vocal Series. She was the seventh VOCALOID overall, as well as the second VOCALOID2 vocal to be released for the engine. Her voice is provided by the Japanese voice actress Saki Fujita (藤田咲, Fujita Saki).

There have since been numerous installments, such as additional voice libraries dubbed 'Append', as well as an upgrade for the VOCALOID3 engine, which contained an English vocal release. She received a VOCALOID4 update to her Japanese and English voicebanks in August 2016, as well as a Mandarin Chinese voicebank in September 2017.

On August 31, 2019, Miku received her first voicebank outside of VOCALOID, that being Piapro Studio with her NT release. While Crypton is focusing on their own program, they are still in collaboration with YAMAHA and will continue to sell VOCALOID products in parallel to the Piapro Studio editions.

Gumi
GUMI (グミ) is a Japanese VOCALOID developed and distributed by Internet Co., Ltd. as Megpoid (メグッポイド), initially released in June 2009 for the VOCALOID2 engine. There have since been three installments developed for the VOCALOID3 engine: an update of the VOCALOID2 voice bank, additional Japanese voice libraries serving as an expansion pack, and a new bank dedicated to the English language. Her Japanese voicebanks were updated to the VOCALOID4 engine in November 2015, while for the VOCALOID6 engine she was released as a bilingual Japanese and English AI vocal in October 2022. Her voice is provided by Filipino-Japanese singer and voice actress Megumi Nakajima (中島愛; Nakajima Megumi).

In January 2014, GUMI received a speech voicebank known as Megpoid Talk. She later received a speech voicebank for A.I.VOICE in September 2022.

Luo Tianyi
Luo Tianyi (洛天依) is a Chinese VOCALOID formerly developed by Bplats, Inc. under the YAMAHA Corporation, and was created in collaboration with Shanghai HENIAN. She was the grand winning entry of the "VOCALOID™ CHINA" contest which was held for choosing the design of the first Chinese VOCALOID. Tianyi was released in July 2012 for the VOCALOID3 engine.

She was updated to the VOCALOID4 engine in December 2017 and was developed and distributed by Shanghai HENIAN. She also received Japanese voicebanks for the VOCALOID4 engine in May 2018 and an update to the VOCALOID5 engine in February 2023. On May 5, 2022, Tianyi was confirmed to be receiving AI voicebanks for ACE Virtual Singer and X Studio. Her AI voicebank for ACE Virtual Singer is currently open for public beta.

She is voiced by two different people. Shan Xin (山新 / 王宥霁 Wáng Yòujì), a professional Chinese voice actress, provided the voice for all current Luo Tianyi releases. In April 2018, it was confirmed that Shan Xin and Kano, a Japanese Utaite, collaborated to provide the samples for Luo Tianyi V4 Japanese.