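Every product has a value curve on the strategy canvas. The process of discovering and creating a value curve that fits the Blue Ocean Strategy is also called Value Innovation: create value for both producer and consumer in a totally newly defined market, ignore competitors in the existing market, and focus on yourself and your product/service.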
Managing a large-scale data center automatically, without much human involvement, is always a challenging task. Industry giants such as Google and Microsoft are pioneers in this area, and very little information has been released about how they handle such problems. But in 2007, Michael Isard of Microsoft Research wrote a paper entitled "Autopilot: Automatic Data Center Management", which describes the technology that the Windows Live and Live Search services have used to manage their server farms. This is a great opportunity to look at how an industry giant manages tens of thousands of machines using software.
- Fault tolerance: any component can fail at any time, so the system must be reliable enough to continue automatically with some proportion of its computers powered down or misbehaving.
- Simplicity: simplicity is as important as fault tolerance when building a large-scale, reliable, maintainable system. Avoid unnecessary optimization and unnecessary generality.
A typical “application” rack might contain 20 identical multi-core computers, each with 4 direct-attached hard drives. Also in the rack is a simple switch allowing the computers to communicate locally with other computers in the rack, and via a switch hierarchy with the rest of the data center.
Finally each computer has a management interface, either built in to the server design or accessed via a rack-mounted serial concentrator.
The set of computers managed by a single instance of Autopilot is called a cluster.
Autopilot consists of three subsystems:
- Hardware management: maintaining machine/switch/router state, automatic error repair, OS provisioning, etc.
- Deployment: automatically deploying applications and data to specified machines in a data center.
- Monitoring: monitoring the state of devices and services inside the data center, collecting performance counters, and displaying them in a user-friendly UI.
- The Device Manager's main responsibility is to maintain replicated state for each device in the data center.
- It decides whether to reboot, re-image, or retire a physical machine, switch, or router.
- It periodically discovers new machines through the special management interface, either built in to the server design or accessed via a rack-mounted serial concentrator.
- It automates the OS installation process through the Provisioning Service.
- It automates the error repair process using the Repair Service.
- It collects device state from the various Watchdog Services.
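The repair decisions listed above can be sketched as a small state tracker. This is only an illustration: the escalation order (reboot first, then re-image, then retire), the `report` method, and all class and field names are assumptions made for the example, not details taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    REBOOT = "reboot"
    REIMAGE = "re-image"
    RETIRE = "retire"


# Assumed policy: every repeated failure tries the next, stronger repair.
ESCALATION = [Action.REBOOT, Action.REIMAGE, Action.RETIRE]


@dataclass
class DeviceRecord:
    name: str
    failed_repairs: int = 0  # repairs tried since the device was last healthy
    retired: bool = False


class DeviceManager:
    """Toy, single-process stand-in; the real DM keeps replicated state."""

    def __init__(self):
        self.devices = {}

    def discover(self, name):
        # Register a newly found device; already-known devices are left alone.
        self.devices.setdefault(name, DeviceRecord(name))

    def report(self, name, healthy):
        """Take one watchdog report and return the repair action to perform."""
        dev = self.devices[name]
        if dev.retired:
            return Action.NONE
        if healthy:
            dev.failed_repairs = 0
            return Action.NONE
        step = min(dev.failed_repairs, len(ESCALATION) - 1)
        action = ESCALATION[step]
        dev.failed_repairs += 1
        if action is Action.RETIRE:
            dev.retired = True
        return action


dm = DeviceManager()
dm.discover("m1")
for healthy in (False, False, False):
    print(dm.report("m1", healthy).value)  # reboot, re-image, retire
```

A healthy report resets the counter, so a machine that recovers after a reboot starts again from the mildest repair the next time it fails.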
- Each machine is assigned a machine function, which indicates what role it plays and what kinds of services will run on it.
- Each machine is also assigned to a scale unit, a collection of machines that serves as the unit of application/OS updates.
- Each machine is responsible for running a list of application/Autopilot services, and this list is stored as a service manifest file. Multiple versions of the manifest file can be stored on a machine, but only one is active; the others are kept so the machine can switch to a new version or roll back when an upgrade fails.
- The Device Manager maintains the list of manifest files for each machine in the cluster and the corresponding active version.
- The Deployment Service is a multi-node service that stores all the application/data files listed in the service manifests. These files are synced from an external build system.
- An Autopilot operator triggers a new code deployment with a single command to the Device Manager (DM). The DM then updates the service manifests of the specified machines accordingly and kicks each machine to start syncing bits from the Deployment Service. Each machine then syncs the manifest file, downloads the specified application/data files to local disk, and starts them.
- In the normal case, each machine periodically asks the DM which manifests should be on its local disk. It fetches any missing manifest files from the Deployment Service.
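The manifest handling described above (several versions on disk, one active, the rest kept for switching or rollback) can be sketched as follows. The `Machine` class, its method names, and the use of a plain dictionary as a stand-in for the Deployment Service are all assumptions made for illustration.

```python
class Machine:
    def __init__(self, name):
        self.name = name
        self.manifests = {}   # version -> list of files, as stored on local disk
        self.active = None    # the single active manifest version
        self.previous = None  # last active version, kept for rollback

    def sync(self, wanted_versions, deployment_service):
        """Fetch every manifest the DM expects that is missing locally."""
        fetched = []
        for version in wanted_versions:
            if version not in self.manifests:
                self.manifests[version] = list(deployment_service[version])
                fetched.append(version)
        return fetched

    def activate(self, version):
        """Switch the active version; the manifest must already be on disk."""
        if version not in self.manifests:
            raise KeyError(f"manifest {version!r} has not been synced")
        self.previous, self.active = self.active, version

    def rollback(self):
        """Return to the previously active version after a failed upgrade."""
        if self.previous is None:
            raise RuntimeError("no earlier version to roll back to")
        self.active, self.previous = self.previous, None


# Stand-in for the Deployment Service: version -> files in that manifest.
deployment_service = {"v1": ["app.exe"], "v2": ["app.exe", "data.bin"]}

m = Machine("m1")
m.sync(["v1"], deployment_service)
m.activate("v1")
m.sync(["v1", "v2"], deployment_service)  # only v2 is missing, so only v2 is fetched
m.activate("v2")
m.rollback()                              # the upgrade failed: back to v1
print(m.active)
```

Keeping the old manifest on disk is what makes the rollback a local switch rather than a second download.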
- Watchdogs constantly probe the status of other services/machines and report it back to the Device Manager. Autopilot provides some system-wide watchdogs, but application developers can build their own as long as these services know how to talk to the DM about device status.
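A custom watchdog of the kind described above could look roughly like this. The `report(machine, healthy)` interface, the `RecordingDM` stub, and the probe function are hypothetical; the source only says that a watchdog must know how to tell the DM about device status.

```python
from typing import Callable, Iterable, List, Tuple


class RecordingDM:
    """Stand-in for the Device Manager: it only records the reports it receives."""

    def __init__(self) -> None:
        self.reports: List[Tuple[str, bool]] = []

    def report(self, machine: str, healthy: bool) -> None:
        self.reports.append((machine, healthy))


def run_watchdog(probe: Callable[[str], bool],
                 machines: Iterable[str],
                 dm: RecordingDM) -> None:
    """Probe every machine once and report each result to the DM."""
    for machine in machines:
        try:
            healthy = bool(probe(machine))
        except Exception:
            # Assumption: a probe that raises counts as a failed probe.
            healthy = False
        dm.report(machine, healthy)


def disk_probe(machine: str) -> bool:
    # Hypothetical application-specific check: "m2" pretends to have a bad
    # disk and "m3" pretends not to respond at all.
    if machine == "m3":
        raise TimeoutError("no response")
    return machine != "m2"


dm = RecordingDM()
run_watchdog(disk_probe, ["m1", "m2", "m3"], dm)
print(dm.reports)
```

In a real deployment the loop would run periodically; a single pass is enough to show the probe-then-report shape.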
- Performance counters are used to record the instantaneous state of components, for example a time-weighted average of the number of requests per second being processed by a particular server.
- The Collection Service forms a distributed collection and aggregation tree for performance counters. It can generate a centralized view of the current state of the cluster’s performance counters with a latency of a few seconds.
- All collected information is stored in a central SQL Server database for fast and complex querying by end users. This data is exposed to application developers and operators through an HTTP-based service called Cockpit.
- Besides providing a global view of the data center's status, Cockpit is also responsible for access to some resources (for example, application, data, and log files).
- Predefined status queries and abnormal results are combined to form an alert service, which can send email and even place phone calls when critical situations occur.
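The collection and aggregation tree for performance counters can be illustrated with a toy in-memory version. Everything here is an assumption made for the sketch: the `CollectorNode` class, the machine/rack/cluster layout, and in particular the choice to combine counters by summing them, which the source does not specify.

```python
from collections import Counter


class CollectorNode:
    """One node of the tree: a machine (leaf) or an aggregator (rack, cluster)."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.local = Counter()  # counters sampled on this node itself

    def record(self, counter, value):
        self.local[counter] += value

    def aggregate(self):
        """Return this node's counters combined with those of its whole subtree."""
        total = Counter(self.local)
        for child in self.children:
            total.update(child.aggregate())  # Counter.update adds the values
        return total


m1, m2, m3 = CollectorNode("m1"), CollectorNode("m2"), CollectorNode("m3")
rack1 = CollectorNode("rack1", [m1, m2])
rack2 = CollectorNode("rack2", [m3])
cluster = CollectorNode("cluster", [rack1, rack2])

m1.record("requests_per_sec", 120)
m2.record("requests_per_sec", 80)
m3.record("requests_per_sec", 50)

print(rack1.aggregate()["requests_per_sec"])    # 200
print(cluster.aggregate()["requests_per_sec"])  # 250
```

Because each level only combines the results of its children, the root gets a cluster-wide view without talking to every machine directly.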
- WeChat History
- On User
- On Requirement
- On Design
- On Interaction
Highlight points on User
- People are lazy: let them do/click less to reach their goal
- People like fashion: do something really cool to attract them
- People lack patience: do not make them read manuals or tips
- People’s time is fragmented: do not give them tasks that need long stretches of continuous time
- People get stupid in a crowd: treat them as somewhat stupid, without too much judgment
- People are emotional: they seek inner satisfaction and the feeling of being
- People like uncertainty: they have lots of curiosity about unknown things
- People are social animals: they want to know more people
- Know your users from a psychological perspective
Highlight points on Requirement
- A product is designed to satisfy some desire that lives in people’s hearts and daily lives
- Satisfy users; don’t put too much moral judgment into your product design
- Distill and abstract all the feedback you get from end users; don’t just do literally what users tell you
- Try to get to know your target users through Weibo, forums, etc.
- Revolutionary products come out when society changes
- Different people usually share some common requirements; those are the most important things to work on
- Associate feature requirements with psychological desires; people are emotional
- Think at large scale and in terms of massive groups for social products/features
- Focus on a few vital scenarios; ignore other trivial stuff
- Polls/surveys can only help you improve existing features; they can’t help with new products/features
- Feature requirements come from solving problems that you and your friends have
Highlight points on Design
- Evolve your product gradually; you can’t design a perfect product on the first try, and every product has its own life cycle
- Products that have a clear DNA will survive longer
- Design the product structure first, and then focus on details
- Categorize: make things clean and clear
- Love abstraction: make things simple and easy
- Design from scenarios, not feature lists
- Be careful about over-design
- Drop features that won’t excite you and your users
- Responsiveness is the king of user experience
- Ship features gradually; don’t move too fast or change too much in one step
- Give users the right to choose: core + plugins
- Respect your users: protect their privacy, save their temporary input, and sign broadcast messages with a real name, not “system administrator”
- One thing for all, not one version per region
- Design for the user; the user plays the main role, not the design itself
- Make things as natural as possible; don’t make people think
- Hide technology from ordinary users
- Focus: less is more
Highlight points on Interaction
- The UI serves the feature
- Make it simple and clean
- Each screen has its own topic
- Hide numbers
Some Comments
- The presentation puts too much emphasis on the importance of the product manager. Most of the time, whether a product will succeed (especially in China) depends on what product you are building and what platform you can leverage. For a social product, existing user data and connections are the most important thing.
- The product manager is not God. God determines everything, but a product manager should design the product as users desire, explicitly or implicitly.
- There is too much criticism of competitors’ product designs, but those features are what I (as a normal user) think WeChat should add.
- WeChat is the most successful product in the market, but what’s the real reason? Is it because of its different feature design? The only reason I can think of is that it’s backed by Tencent, which has a large number of QQ users, and that it is bound to QQ friends.
- He said a lot about avoiding “over-design”, but also talked too much about active design.
- The presentation lacks something called “无为而治” (governing by non-interference): whether a product will succeed, and what the final running system will look like, is determined not only by how the product manager designs it but also by how people interact with it.
- He should also thank QQ’s user data, the competitors, the creator of Kik, and the great era of the mobile Internet.