 Bursa Closed for First Session, From Bursa Malaysia

djspinnet
post Jul 4 2008, 02:37 PM

On my way
****
Senior Member
532 posts

Joined: Aug 2007


QUOTE(cherroy @ Jul 4 2008, 02:07 PM)
In Malaysia, it won't. Malaysia doesn't have this kind of culture.

A single hard disk problem can jam up the whole core trading system? No redundancy or backup plan?
*
Better yet, saw this in the Star biz section print version (not found in the online version)

I'll try to recap as much as I can of the chronology of events (the newspaper is in my boss's office at the moment, but you can pick up the Star print version and check it out yourself)

6.00 am the hard disk died, hard disk replaced
6.30 am replacement hard disk causing problems, and triggering failure in another hard disk and the CPU
8.30 am activate the backup system
1.00 pm backup system start up took too long, switch back to production site to reattempt.
from then onwards, connection problems bla bla bla

Question time. A problematic hard disk causing failure in another hard disk and the CPU??

Second question. Only one failover system? That's it?
fyire
post Jul 4 2008, 02:54 PM

Look at all my stars!!
VIP
9,270 posts

Joined: Jan 2003
From: Somewhere out there
QUOTE(djspinnet @ Jul 4 2008, 02:37 PM)
Better yet, saw this in the Star biz section print version (not found in the online version)

I'll try to recap as much as I can of the chronology of events (the newspaper is in my boss's office at the moment, but you can pick up the Star print version and check it out yourself)

6.00 am the hard disk died, hard disk replaced
6.30 am replacement hard disk causing problems, and triggering failure in another hard disk and the CPU
8.30 am activate the backup system
1.00 pm backup system start up took too long, switch back to production site to reattempt.
from then onwards, connection problems bla bla bla

Question time. A problematic hard disk causing failure in another hard disk and the CPU??

Second question. Only one failover system? That's it?
*
The other interesting thing to keep in mind here is that the explanation given focuses so much on the failure of a single hard disk. So what happened to the RAID array? Unless the attempt by the RAID controller to rebuild the data on the new disk went haywire as well.

The other thing of note is the need to activate the backup systems manually, considering that such real-time systems would normally have automated failover.
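
For what it's worth, automated failover usually comes down to a heartbeat check. A minimal Python sketch, purely illustrative: the class name, the interval and the missed-heartbeat limit are my own assumptions, not anything Bursa or HP has described.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Hypothetical watchdog that promotes the backup when the primary goes quiet."""
    heartbeat_interval_s: float = 1.0   # assumed: primary reports once per second
    missed_limit: int = 3               # assumed: tolerate 3 missed heartbeats
    active: str = "primary"
    last_seen_s: float = 0.0

    def heartbeat(self, now_s: float) -> None:
        # Record the time of the latest heartbeat from the primary.
        self.last_seen_s = now_s

    def check(self, now_s: float) -> str:
        # Switch to the backup once the primary has been silent for too long.
        silent_for = now_s - self.last_seen_s
        if self.active == "primary" and silent_for > self.heartbeat_interval_s * self.missed_limit:
            self.active = "backup"
        return self.active
```

The only point is that the switch happens on a timer with no human in the loop. A real setup would also have to guard against split-brain and decide how to fail back.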
hpteh
post Jul 4 2008, 03:22 PM

Getting Started
**
Junior Member
147 posts

Joined: Jul 2006
For Christ's sake, we're talking about a stock exchange system here. Either they're really dumb, in which case we should be proud of ourselves, since the IT guys at the SE don't have even half of our knowledge of how to implement a proper real-time online system, or they're stupid enough to believe they can get away with this kind of excuse... Malaysia, really, whatever also boleh (anything goes)...

This post has been edited by hpteh: Jul 5 2008, 09:22 AM
shoduken
post Jul 4 2008, 03:51 PM

Regular
******
Senior Member
1,741 posts

Joined: Mar 2008
While the market is dropping, let's discuss which stock is worth buying :-D
dragony
post Jul 4 2008, 04:19 PM

Getting Started
**
Junior Member
167 posts

Joined: Feb 2006
From: KL, PJ


I SUGGEST RESORT & BJTOTO.....
robertngo
post Jul 4 2008, 05:25 PM

Look at all my stars!!
*******
Senior Member
4,027 posts

Joined: Oct 2004


QUOTE(fyire @ Jul 4 2008, 02:54 PM)
The other interesting thing to keep in mind here is that the explanation given focuses so much on the failure of a single hard disk. So what happened to the RAID array? Unless the attempt by the RAID controller to rebuild the data on the new disk went haywire as well.

The other thing of note is the need to activate the backup systems manually, considering that such real-time systems would normally have automated failover.
*
This is not an entry-level Dell server; the trading system runs on mainframe-class hardware. I find it really hard to believe a single failed hard disk caused this problem. Even if some of the disks and CPUs are down, it can still keep running. Also, BNM requires all financial companies to have a DR site that can take over within a few minutes. That was not available here; the DR site was not working properly. Does Bursa have the same requirement as banks and financial institutions to perform regular DR trials and report the results to BNM? If so, someone may have lied in the report. If there is no requirement for Bursa to have a hot-standby DR system, I will be shocked; such an important part of our country's economy is at risk.

Here is the official timeline:

[image: official timeline]

http://biz.thestar.com.my/news/story.asp?f...38&sec=business

The CIO said they are using an HP NonStop server. This is a fault-tolerant server: every part of the system has redundancy, so there is no single point of failure.

http://en.wikipedia.org/wiki/NonStop

This post has been edited by robertngo: Jul 4 2008, 05:39 PM
dreamer101
post Jul 4 2008, 08:49 PM

10k Club
Elite
15,855 posts

Joined: Jan 2003
QUOTE(robertngo @ Jul 4 2008, 05:25 PM)
This is not an entry-level Dell server; the trading system runs on mainframe-class hardware. I find it really hard to believe a single failed hard disk caused this problem. Even if some of the disks and CPUs are down, it can still keep running. Also, BNM requires all financial companies to have a DR site that can take over within a few minutes. That was not available here; the DR site was not working properly. Does Bursa have the same requirement as banks and financial institutions to perform regular DR trials and report the results to BNM? If so, someone may have lied in the report. If there is no requirement for Bursa to have a hot-standby DR system, I will be shocked; such an important part of our country's economy is at risk.

Here is the official timeline:

[image: official timeline]

http://biz.thestar.com.my/news/story.asp?f...38&sec=business

The CIO said they are using an HP NonStop server. This is a fault-tolerant server: every part of the system has redundancy, so there is no single point of failure.

http://en.wikipedia.org/wiki/NonStop
*
robertngo,

A NonStop aka Tandem computer has NO SINGLE point of failure. So, a single hard disk failure cannot bring the system down. In summary, there is something WRONG here.

Dreamer

robertngo
post Jul 5 2008, 09:22 AM

Look at all my stars!!
*******
Senior Member
4,027 posts

Joined: Oct 2004


QUOTE(dreamer101 @ Jul 4 2008, 08:49 PM)
robertngo,

A NonStop aka Tandem computer has NO SINGLE point of failure.  So, a single hard disk failure cannot bring the system down.  In summary, there is something WRONG here.

Dreamer
*
Yeah, I have worked with NonStop servers before, which is why I find the story very hard to believe. It said the replacement disk caused another hard disk and the CPU to fail, which would be really strange.

This post has been edited by robertngo: Jul 5 2008, 09:26 AM
fyire
post Jul 6 2008, 12:18 AM

Look at all my stars!!
VIP
9,270 posts

Joined: Jan 2003
From: Somewhere out there
QUOTE(robertngo @ Jul 4 2008, 05:25 PM)
This is not an entry-level Dell server; the trading system runs on mainframe-class hardware. I find it really hard to believe a single failed hard disk caused this problem. Even if some of the disks and CPUs are down, it can still keep running. Also, BNM requires all financial companies to have a DR site that can take over within a few minutes. That was not available here; the DR site was not working properly. Does Bursa have the same requirement as banks and financial institutions to perform regular DR trials and report the results to BNM? If so, someone may have lied in the report. If there is no requirement for Bursa to have a hot-standby DR system, I will be shocked; such an important part of our country's economy is at risk.

Here is the official timeline:

[image: official timeline]

http://biz.thestar.com.my/news/story.asp?f...38&sec=business

The CIO said they are using an HP NonStop server. This is a fault-tolerant server: every part of the system has redundancy, so there is no single point of failure.

http://en.wikipedia.org/wiki/NonStop
*
Well, that's the first thing that came to mind too: the reasons they've given for the failure are more appropriate for the downtime of entry-level servers.
dreamer101
post Jul 6 2008, 12:30 AM

10k Club
Elite
15,855 posts

Joined: Jan 2003
QUOTE(fyire @ Jul 6 2008, 12:18 AM)
Well, that's the first thing that came to mind too: the reasons they've given for the failure are more appropriate for the downtime of entry-level servers.
*
fyire,

In conclusion, some Malaysian IT people are NOT capable of using the NonStop server properly.

"First world infrastructure, third world process and mentality"

Dreamer

This post has been edited by dreamer101: Jul 6 2008, 01:43 AM
robertngo
post Jul 6 2008, 03:49 AM

Look at all my stars!!
*******
Senior Member
4,027 posts

Joined: Oct 2004


QUOTE
To ensure this, Bursa is mulling over the idea of an automated start-up of its back-up system in a situation when a partial or single point at the primary site fails.

Chief information officer Yew Kim Keong said trials on its recovery system were previously done for the scenario of a total system breakdown and not when a single point failed.

This happened on Thursday when the derivatives and bond trading were still operating despite the failure in the equities trading system. As such, the exchange decided to only switch on that part to the back-up system.

However, the synchronisation of data between the primary site and back-up system was longer than the anticipated three hours, stopping Bursa from resuming trading in the afternoon, Yew said.

Hewlett-Packard (M) Sdn Bhd managing director T.F. Chong said the design of the computer system depended on the business environment and the requirements of the respective stock exchange.

Hewlett-Packard is the vendor of the HP Non-Stop Hardware, which is the existing architecture being used by Bursa.

On whether cost was a constraint for Bursa to have a variety of situational recovery trials, Yusli said the exchange had to be practical.

“We could spend all our time testing on business continuity process (BCP) or draw a line somewhere. Our BCP is in line with international practice,” he said.

Bursa spends the most on technology after manpower, which stood at 30% and 50% respectively of operating cost. Total operating cost amounts to RM200mil.


The BCP failed to prepare for the failure of a single trading system, and recovery was not done immediately. They also don't have live data replication set up between the production and DR sites, and needed more than three hours to sync their production data. This is a big problem for a trading system: if you lose more than three hours, you lose an entire trading session. Banks are required by BNM to have live data replication to the DR site for instant recovery during a disaster, so why is a vital financial institution like Bursa not required to do the same??
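
A rough back-of-envelope sketch in Python of why live replication matters. All the numbers here (write rate, link speed, time since the last sync) are made up for illustration; the article gives none of them.

```python
def backlog_mb(write_rate_mb_per_s: float, seconds_since_last_sync: float) -> float:
    """Data the DR site is missing, assuming a steady write rate on production."""
    return write_rate_mb_per_s * seconds_since_last_sync


def catch_up_seconds(backlog: float, link_mb_per_s: float) -> float:
    """Time to ship that backlog over the link to the DR site."""
    return backlog / link_mb_per_s


# Hypothetical periodic sync: DR site last synced 24 hours ago.
stale = catch_up_seconds(backlog_mb(0.5, 24 * 3600), link_mb_per_s=5.0)

# Hypothetical live replication: DR site is only about 2 seconds behind.
live = catch_up_seconds(backlog_mb(0.5, 2), link_mb_per_s=5.0)

print(f"periodic sync catch-up: {stale / 3600:.1f} h")
print(f"live replication catch-up: {live:.1f} s")
```

With these assumed figures the stale DR site needs hours to catch up while the live one needs a fraction of a second, which is the difference between losing a trading session and not.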

dreamer101
post Jul 6 2008, 04:43 AM

10k Club
Elite
15,855 posts

Joined: Jan 2003
QUOTE(robertngo @ Jul 6 2008, 03:49 AM)
The BCP failed to prepare for the failure of a single trading system, and recovery was not done immediately. They also don't have live data replication set up between the production and DR sites, and needed more than three hours to sync their production data. This is a big problem for a trading system: if you lose more than three hours, you lose an entire trading session. Banks are required by BNM to have live data replication to the DR site for instant recovery during a disaster, so why is a vital financial institution like Bursa not required to do the same??
*
robertngo,

1) Why did we have a whole radar station burn down because nobody checked the fuse of the fire alarm??

2) Why was KLIA out of power for 4 hours even though we are supposed to have redundant power feeds from 2 power suppliers and 2 power networks??

Understand this and you will understand ALMOST everything WRONG about Malaysia, and why Malaysia is NOT going forward in spite of the tremendous amount of resources that we have.

Dreamer
cherroy
post Jul 6 2008, 06:56 AM

20k VIP Club
Staff
25,802 posts

Joined: Jan 2003
From: Penang


QUOTE(dreamer101 @ Jul 6 2008, 12:30 AM)
fyire,

In conclusion, some Malaysian IT people are NOT capable of using the NonStop server properly.

"First world infrastructure, third world process and mentality"

Dreamer
*
One of the more plausible answers for the root cause.

To be fair, I would change "some Malaysian IT people" to "Bursa IT people or the vendor responsible", as we don't know whether they subcontract the servers out or manage them in-house.


dreamer101
post Jul 6 2008, 10:26 AM

10k Club
Elite
15,855 posts

Joined: Jan 2003
QUOTE(cherroy @ Jul 6 2008, 06:56 AM)
One of the more plausible answers for the root cause.

To be fair, I would change "some Malaysian IT people" to "Bursa IT people or the vendor responsible", as we don't know whether they subcontract the servers out or manage them in-house.
*
cherroy,

1) In this case, it is BURSA IT people. But I have seen ENOUGH examples to know this problem is MORE widespread than just Bursa.

2) Recently, some Alliance Bank branches were out for one or 2 days too.

As per Godfather movie, "Fish rot from the head first".

Dreamer

The SHORT answer is that to manage and run ANY kind of complex system, you NEED GOOD EXPERIENCED PEOPLE to support IT. So, essentially, you have 2 choices:

A) Train your internal people and pay them well enough so that they stay technical

Or

B) Outsource to someone else but you still NEED GOOD people to monitor them

Malaysian culture is that WE spend a lot of MONEY on the hardware. But we neither train, pay nor keep REAL TECHNICAL people to support and maintain that hardware. There is NO career path in Malaysia for senior technical people with a lot of experience. Some take the shortcut and hire expats. But they REFUSE to hire overseas Malaysians with similar or better experience. The REASON is they think it is EASIER to control the expats. But that is harmful for the future of the country. In the end, we are in this catch-22 situation that we NEVER get out of:

A) There are NO GOOD EXPERIENCED TECHNICAL people that deserve very high pay

B) Technical people do not stay technical because they make more money as manager.

In the end, nothing changes and Malaysia cannot go to the next level of IT capabilities.


This post has been edited by dreamer101: Jul 6 2008, 10:41 AM
howszat
post Jul 6 2008, 11:01 AM

Look at all my stars!!
*******
Senior Member
2,932 posts

Joined: Sep 2007
There are just really basic things from an IT point of view that don't make sense in their timeline:

>> Faulty disk replaced

It is now becoming standard for servers to have redundancy built in, especially in the disk storage sub-systems. With RAID arrays, a single disk failure, or even multiple disk failures, doesn't cause a system outage. You can replace the faulty items at some later convenient time, and with hot-swappable/pluggable systems you don't even need to shut the system down.

>> Backup site start-up process takes longer than expected.

They obviously didn't test this properly and/or regularly enough, or whatever they have implemented is just not appropriate.

Sounds like there are a few incompetent people in their IT.
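
To illustrate the redundancy point, here is a small Python sketch of how a parity-based array (RAID 5 style) rebuilds a lost disk's block from the survivors. It is a generic illustration with made-up block contents; nothing in the articles says which scheme Bursa's storage actually used.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per disk, plus a parity block on a fourth.
data = [b"TRADE001", b"TRADE002", b"TRADE003"]
parity = xor_blocks(data)

# Disk 1 dies: XOR of the surviving data blocks and the parity gives its block back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The array keeps serving reads this way while the replacement disk is rebuilt in the background, which is why a single dead disk should not take a system down.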
cherroy
post Jul 6 2008, 11:19 AM

20k VIP Club
Staff
25,802 posts

Joined: Jan 2003
From: Penang


QUOTE(dreamer101 @ Jul 6 2008, 10:26 AM)
cherroy,

1) In this case, it is BURSA IT people. But I have seen ENOUGH examples to know this problem is MORE widespread than just Bursa.

2) Recently, some Alliance Bank branches were out for one or 2 days too.

As per Godfather movie, "Fish rot from the head first".

Dreamer

The SHORT answer is that to manage and run ANY kind of complex system, you NEED GOOD EXPERIENCED PEOPLE to support IT. So, essentially, you have 2 choices:

A) Train your internal people and pay them well enough so that they stay technical

Or

B) Outsource to someone else but you still NEED GOOD people to monitor them

Malaysian culture is that WE spend a lot of MONEY on the hardware. But we neither train, pay nor keep REAL TECHNICAL people to support and maintain that hardware. There is NO career path in Malaysia for senior technical people with a lot of experience. Some take the shortcut and hire expats. But they REFUSE to hire overseas Malaysians with similar or better experience. The REASON is they think it is EASIER to control the expats. But that is harmful for the future of the country. In the end, we are in this catch-22 situation that we NEVER get out of:

A) There are NO GOOD EXPERIENCED TECHNICAL people that deserve very high pay

B) Technical people do not stay technical because they make more money as manager.

In the end, nothing changes and Malaysia cannot go to the next level of IT capabilities.
*
I didn't know Alliance Bank had such a serious issue; I have never dealt with them either. Very poor indeed.

I think this is a problem in Malaysia (I don't know about elsewhere).

Technical people are not paid well in general, as most Malaysian companies don't view technical people as a valuable asset. Most of the time we see experienced technicians or engineers with up to 10 years of expertise and experience whose wages can't match those of a newly hired manager with not much experience. Experience and expertise are an intangible asset for a company, especially in dealing with IT, and are much harder to replace than an ordinary manager. (There are some very good managers who are also valuable to the company; I mean the ordinary ones.)

Also, one of the most important issues is that in some company cultures, promotion is based on favouritism, not merit.

QUOTE(howszat @ Jul 6 2008, 11:01 AM)
There are just really basic things from an IT point of view that don't make sense in their timeline:

>> Faulty disk replaced

It is now becoming standard for servers to have redundancy built in, especially in the disk storage sub-systems. With RAID arrays, a single disk failure, or even multiple disk failures, doesn't cause a system outage. You can replace the faulty items at some later convenient time, and with hot-swappable/pluggable systems you don't even need to shut the system down.

>> Backup site start-up process takes longer than expected.

They obviously didn't test this properly and/or regularly enough, or whatever they have implemented is just not appropriate.

Sounds like there are a few incompetent people in their IT.
*
The hardware is there; it is just a matter of how people manage it. This is more a management issue on the IT side than a hardware issue.

This post has been edited by cherroy: Jul 6 2008, 11:24 AM
howszat
post Jul 6 2008, 11:30 AM

Look at all my stars!!
*******
Senior Member
2,932 posts

Joined: Sep 2007
QUOTE(cherroy @ Jul 6 2008, 11:19 AM)

The hardware is there; it is just a matter of how people manage it. This is more a management issue on the IT side than a hardware issue.
*
Ultimately it comes down to people/management. But that high-end, expensive hardware is not supposed to behave the way they described.
gilabola
post Jul 6 2008, 11:36 AM

On my way
****
Senior Member
670 posts

Joined: Jan 2005


Bursa's server that crashed was an HP NonStop K-Series, which is already obsolete and 2 generations old.

HP released a new line, the NonStop S-Series, to replace the K-Series back in the late 1990s. In 2008, HP phased out the S-Series in favour of its new NonStop Integrity line, which runs on Intel Itanium processors. (The K-Series and S-Series run on MIPS processors.)

The Edge had an article this week that implies that Bursa stopped investing in the old server because it was replacing the system with a new trading system from Atos Euronext... but this project was delayed for 2 years... so the old K-Series machine had to keep running and finally died.



This post has been edited by gilabola: Jul 6 2008, 11:45 AM
howszat
post Jul 6 2008, 11:41 AM

Look at all my stars!!
*******
Senior Member
2,932 posts

Joined: Sep 2007
QUOTE(gilabola @ Jul 6 2008, 11:36 AM)
Bursa's server that crashed was an HP NonStop K-Series, which is already obsolete and 2 generations old. The K-Series was already replaced by the S-Series back in the late 1990s. The S-Series was replaced this year by HP's NonStop Integrity line, which runs on Intel Itanium processors. The K-Series and S-Series run on MIPS processors.

The Edge had an article that implies that Bursa stopped investing in the old server because it was replacing the system with a new trading system from Atos Euronext... but this project was delayed for 2 years... so the old machine had to keep running and finally died.
*
Ah, that could explain a lot then...
cherroy
post Jul 6 2008, 04:36 PM

20k VIP Club
Staff
25,802 posts

Joined: Jan 2003
From: Penang


QUOTE(gilabola @ Jul 6 2008, 11:36 AM)
Bursa's server that crashed was an HP NonStop K-Series, which is already obsolete and 2 generations old.

HP released a new line, the NonStop S-Series, to replace the K-Series back in the late 1990s. In 2008, HP phased out the S-Series in favour of its new NonStop Integrity line, which runs on Intel Itanium processors. (The K-Series and S-Series run on MIPS processors.)

The Edge had an article this week that implies that Bursa stopped investing in the old server because it was replacing the system with a new trading system from Atos Euronext... but this project was delayed for 2 years... so the old K-Series machine had to keep running and finally died.
*
Using a server that became obsolete 10 years ago?

Bursa makes a hefty profit every year, even giving special dividends and capital repayments to shareholders, so financially it should be no problem for them to invest in up-to-date hardware.
